Confer – End to end encrypted AI chat

(confer.to)

57 points | by vednig 7 hours ago

15 comments

  • shawnz 2 hours ago
    I don't agree that this is end to end encrypted. For example, a compromise of the TEE would mean your data is exposed. In a truly end to end encrypted system, I wouldn't expect a server side compromise to be able to expose my data.

    This is similar to the weasely language Google is now using with the Magic Cue feature ever since Android 16 QPR 1. When it launched, it was local only -- now it's local and in the cloud "with attestation". I don't like this trend and I don't think I'll be using such products

    • liuliu 1 hour ago
      I agree it is more like e2teee, but I think there is really no alternative beyond TEE + anonymization. Privacy people want it locally, but it is 5 to 10 years away (or never, if the current economics works, there is no need to reverse the trend).
      • shawnz 1 hour ago
        There's FHE, but that's probably an even more difficult technical challenge than doing everything locally
      • ignoramous 1 hour ago
        > ... 5 to 10 years away (or never, if the current economics works...

        Think PCs in 5y to 10y that can run SoTA multi-modal LLMs (cf Mac Pro) will cost as much as cars do, and I reckon folks will buy it.

        • binary132 45 minutes ago
          ISTM that most people would rather give away their privacy than pay even a single cent for most things.
    • 2bitencryption 1 hour ago
      if (big if) you trust the execution environment, which is apparently auditable, and if (big if) you trust the TEE merkle hash used to sign the response is computed based on the TEE as claimed (and not a malicious actor spoofing a TEE that lives within an evil environment) and also if you trust the inference engine (vllm / sglang, what have you) then I guess you can be confident the system is private.

      Lots of ifs there, though. I do trust Moxie in terms of execution though. Doesn’t seem like the type of person to take half measures.

    • derefr 1 hour ago
      "Server-side" is a bit of a misnomer here.

      Sure, for e.g. E2E email, the expectation is that all the computation occurs on the client, and the server is a dumb store of opaque encrypted stuff.

      In a traditional E2E chat app, on the other hand, you've still got a backend service acting as a dumb pipe, that shouldn't have the keys to decrypt traffic flowing through it; but you've also got multiple clients — not just your own that share your keybag, but the clients of other users you're communicating with. "E2E" in the context of a chat app, means "messages are encrypted within your client; messages can then only be decrypted within the destination client(s) [i.e. the client(s) of the user(s) in the message thread with you.]"

      "E2E AI chat" would be E2E chat, with an LLM. The LLM is the other user in the chat thread with you; and this other user has its own distinct set of devices that it must interact through (because those devices are within the security boundary of its inference infrastructure.) So messages must decrypt on the LLM's side for it to read and reply to, just as they must decrypt on another human user's side for them to read and reply to. The LLM isn't the backend here; the chat servers acting as a "pipe" are the backend, while the LLM is on the same level of the network diagram as the user is.

      Let's consider the trivial version of an "E2E AI chat" design, where you physically control and possess the inference infrastructure. The LLM infra is e.g. your home workstation with some beefy GPUs in it. In this version, you can just run Signal on the same workstation, and connect it to the locally-running inference model as an MCP server. Then all your other devices gain the ability to "E2E AI chat" with the agent that resides in your workstation.

      The design question, being addressed by Moxie here, is what happens in the non-trivial case, when you aren't in physical possession of any inference infrastructure.

      Which is obviously the applicable case to solve for most people, 100% of the time, since most people don't own and won't ever own fancy GPU workstations.

      But, perhaps more interesting for us tech-heads that do consider buying such hardware, and would like to solve problems by designing architectures that make use of it... the same design question still pertains, at least somewhat, even when you do "own" the infra; just as long as you aren't in 100% continuous physical possession of it.

      You would still want attestation (and whatever else is required here) even for an agent installed on your home workstation, so long as you're planning to ever communicate with it through your little chat gateway when you're not at home. (Which, I mean... why else would you bother with setting up an "E2E AI chat" in the first place, if not to be able to do that?)

      Consider: your local flavor of state spooks could wait for you to leave your house; slip in and install a rootkit that directly reads from the inference backend's memory; and then disappear into the night before you get home. And, no matter how highly you presume your abilities to detect that your home has been intruded into / your computer has been modified / etc once you have physical access to those things again... you'd still want to be able to detect a compromise of your machine even before you get home, so that you'll know to avoid speaking to your agent (and thereby the nearby wiretap van) until then.

    • Stefan-H 1 hour ago
      Just like your mobile device is one end of the end-to-end encryption, the TEE is the other end. If properly implemented, the TEE would measure all software and ensure that there are no side channels that the sensitive data could be read from.
      • paxys 1 hour ago
        By that logic SSL/TLS is also end-to-end encryption, except it isn't
        • Stefan-H 1 hour ago
          When the server is the final recipient of a message sent over TLS, then yes, that is end-to-end encryption (for instance if a load balancer is not decrypting traffic in the middle). If the message's final recipient is a third party, then you are correct, an additional layer of encryption would be necessary. The TEE is the execution environment that needs access to the decrypted data to process the AI operations, therefore it is one end of the end-to-end encryption.
          • shawnz 1 hour ago
            This interpretation basically waters down the meaning of end-to-end encryption to the point of uselessness. You may as well just say "encryption".
            • Stefan-H 1 hour ago
              E2EE is usually applied in contexts where the message's final recipient is NOT the server on the other end of a TLS connection, so yes, this scenario is a stretch. The point is that in the context of an AI chat app, you have to decide on the boundary that you draw around the server components that are processing the request and necessarily need access to decrypted data, and call that one "end" of the connection.
          • paxys 1 hour ago
            No need to make up hypotheticals. The server isn't the final destination for your LLM requests. The reply needs to come back to you.
            • charcircuit 57 minutes ago
              If Bob and Alice are in an E2EE chat Bob and Alice are the ends. Even if Bob asks Alice a question and she replies back to Bob, Alice is still an end.

              Similarly with AI. The AI is one of the ends of the conversation.

  • jeroenhd 5 hours ago
    An interesting take on the AI model. I'm not sure what their business model is like, as collecting training data is the one thing that free AI users "pay" in return for services, but at least this chat model seems honest.

    Using remote attestation in the browser to attest the server rather than the client is refreshing.

    Using passkeys to encrypt data does limit browser/hardware combinations, though. My Firefox+Bitwarden setup doesn't work with this, unfortunately. Firefox on Android also seems to be broken, but Chrome on Android works well at least.

  • datadrivenangel 5 hours ago
    Get a fun error message on debian 13 with firefox v140:

    "This application requires passkey with PRF extension support for secure encryption key storage. Your browser or device doesn't support these advanced features.Please use Chrome 116+, Firefox 139+, or Edge 141+ on a device with platform authentication (Face ID, Touch ID, Windows Hello, etc.)."

    • butz 47 minutes ago
      Great new way to lock out potential new users. I bet large part of users interested in privacy are using Linux and some fork of Firefox.
    • crtasm 2 hours ago
      That is funny it won't even show us the homepage.

      We are allowed into the blog though! https://confer.to/blog/

    • Marsymars 1 hour ago
      I'm getting that on macOS with Firefox 139+, for whatever reason...
  • throwaway35636 20 minutes ago
    Interestingly the confer image on GitHub doesn’t seem to include the model weights in the attestation (they seem to be loaded from a mounted ext4 disk without dm-verity). Probably this doesn’t compromise the privacy of the communication (as long as the model format does not contain any executable part) but it exposes users to a “model swapping” attack, where the confer operator makes a user talk to an “evil” model without them being able to notice it. Such an evil model may be fine tuned to provide some specifically crafted output to the user. Authenticating the model seems important, maybe it is done at another level of the stack?
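    A minimal sketch of the kind of check that would close this gap, assuming the expected weights digest could be pinned inside the attested image. This is not Confer's code: `weights_digest` and `weights_match` are hypothetical names, and dm-verity would verify blocks on read rather than hashing the whole file up front as done here.

```python
import hashlib
import hmac
import io
from typing import BinaryIO


def weights_digest(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a weights file, read in chunks."""
    h = hashlib.sha256()
    while chunk := stream.read(chunk_size):
        h.update(chunk)
    return h.hexdigest()


def weights_match(stream: BinaryIO, pinned_hex: str) -> bool:
    """Compare the computed digest with a digest pinned in the attested image.

    `pinned_hex` is an assumption: it stands for a value baked into the
    measured image, so swapping the weights would change the comparison result.
    """
    return hmac.compare_digest(weights_digest(stream), pinned_hex.lower())


if __name__ == "__main__":
    fake_weights = io.BytesIO(b"not real model weights")
    pinned = hashlib.sha256(b"not real model weights").hexdigest()
    print(weights_match(fake_weights, pinned))
```

    With a check like this running inside the measured image, a swapped model would fail the comparison instead of silently serving responses.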
  • JohnFen 6 hours ago
    Unless I misunderstand, this doesn't seem to address what I consider to be the largest privacy risk: the information you're providing to the LLM itself. Is there even a solution to that problem?

    I mean, e2ee is great and welcome, of course. That's a wonderful thing. But I need more.

    • roughly 4 hours ago
      Looks like Confer is hosting its own inference: https://confer.to/blog/2026/01/private-inference/

      > LLMs are fundamentally stateless—input in, output out—which makes them ideal for this environment. For Confer, we run inference inside a confidential VM. Your prompts are encrypted from your device directly into the TEE using Noise Pipes, processed there, and responses are encrypted back. The host never sees plaintext.

      I don’t know what model they’re using, but it looks like everything should be staying on their servers, not going back to, eg, OpenAI or Anthropic.

      • JohnFen 1 hour ago
        > Looks like Confer is hosting its own inference

        Even so, you're still exposing your data to Confer, and so you have to trust them that they'll behave as you want. That's a security problem that Confer doesn't help with.

        I'm not saying Confer isn't useful, though. e2ee is very useful. But it isn't enough to make me feel comfortable.

        • internet_points 43 minutes ago
          > you're still exposing your data to Confer

          They use a https://en.wikipedia.org/wiki/Trusted_execution_environment and iiuc claim that your client can confirm (attest) that the code they run doesn't leak your data, see https://confer.to/blog/2026/01/private-inference/

          So you should be able to run https://github.com/conferlabs/confer-image yourself and get a hash of that and then confer.to will send you that same hash, but now it's been signed by Intel I guess? to tell you that yes not only did confer.to send you that hash, but that hash is indeed a hash of what's running inside the Trusted Execution Environment.

          I feel like this needs diagrams.
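          In lieu of a diagram, a rough sketch of the check described above, under loud assumptions: `Quote`, `measure_image` and `quote_matches_image` are hypothetical names, a plain SHA-256 stands in for the real TEE measurement format, and the vendor signature check is passed in as a callable because the actual Intel verification chain is not shown in this thread.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable


@dataclass
class Quote:
    """Hypothetical attestation quote: a measurement plus a vendor signature."""
    measurement: bytes  # hash of what the TEE claims to be running
    signature: bytes    # hardware vendor's signature over that measurement


def measure_image(image: bytes) -> bytes:
    # Stand-in for a reproducible-build measurement (assumption: plain SHA-256).
    return hashlib.sha256(image).digest()


def quote_matches_image(
    quote: Quote,
    local_image: bytes,
    verify_signature: Callable[[bytes, bytes], bool],
) -> bool:
    # Step 1: the signature must verify against the hardware vendor's key.
    # How that is done is delegated to the caller-supplied function.
    if not verify_signature(quote.measurement, quote.signature):
        return False
    # Step 2: the signed measurement must equal the hash of the image
    # you built yourself from the published source.
    return hmac.compare_digest(quote.measurement, measure_image(local_image))
```

          Note that this only establishes which image is running; it says nothing about side channels or about data the image loads without measuring.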

          • binary132 29 minutes ago
            As I read it, the attestation is simply that the server is running a particular kernel and application in the Secure Enclave using the hardware’s certification. That does not attest that there is no sidechannel. If exfiltration from the TEE is achieved, the attestation will not change.

            To put it another way, I am quite sure that a sufficiently skilled (or privileged: how do you know the manufacturer is not keeping copies of these hardware keys?) team could sit down with one of these enclave modules and figure out how to get the memory image (or whatever) out without altering the attested signature.

      • jeroadhd 1 hour ago
        That is a highly misleading statement: the GPU runs with real weights and real unencrypted user plaintext, since it has to multiply matrices of plain text, which is passed on to the supposedly "secure VM" (protected by Intel/Nvidia promises) and encrypted there. In no way is it e2e, unless you count the GPU as the "end".
        • AlanYx 1 hour ago
          It is true that nVidia GPU-CC TEE is not secure against decapsulation attacks, but there is a lot of effort to minimize the attack surface. This recent paper gives a pretty good overview of the security architecture: https://arxiv.org/pdf/2507.02770
        • Imustaskforhelp 1 hour ago
          So what you are saying is that all the TEE and remote attestation and everything might work for CPU based workflows but they just don't work with GPU effectively being unencrypted and anyone can read it from there?

          Edit: https://news.ycombinator.com/item?id=46600839 this comment says that GPUs have such capabilities as well, so I am interested in what you were referring to in the first place?

      • dang 2 hours ago
        We'll add that link to the toptext as well. Thanks!

        (It got submitted a few times but did not get any comments - might as well consolidate these threads)

  • paxys 1 hour ago
    "trusted execution environment" != end-to-end encryption

    The entire point of E2EE is that both "ends" need to be fully under your control.

    • optymizer 48 minutes ago
      This is false.

      From Wikipedia: "End-to-end encryption (E2EE) is a method of implementing a secure communication system where only the sender and intended recipient can read the messages."

      Both ends do not need to be under your control for E2EE.

    • Stefan-H 1 hour ago
      The point of E2EE is that only the people/systems that need access to the data are able to do so. If the message is encrypted on the user's device and then is only decrypted in the TEE where the data is needed in order to process the request, and only lives there ephemerally, then in what way is it not end-to-end encrypted?
      • paxys 50 minutes ago
        Because anyone with access to the TEE also has access to the data. The owners can say they won't tamper with it, but those are promises, not guarantees.
  • orbital-decay 42 minutes ago
    At least Cocoon and similar services relying on TEE don't call this end-to-end encryption. Hardware DRM is not E2EE, it's security by obscurity. Not to say it doesn't work, but it doesn't provide mathematically strong guarantees either.
  • jdthedisciple 43 minutes ago
    The best private LLM is the one you host yourself.
  • slipheen 1 hour ago
    Does it say anywhere which model it’s using?

    I see references to vLLM in the GitHub but not which actual model (Llama, Mistral, etc.) or if they have a custom fine tune, or you give your own huggingface link?

  • hiimkeks 2 hours ago
    I am confused. I get E2EE chat with a TEE, but the TEEs I know of (admittedly not an expert) are not powerful enough to do the actual inference, at least not any useful one. The blog posts published so far just gloss over that.
  • letmetweakit 1 hour ago
    How does inference work with a TEE, isn’t performance a lot more restricted?
  • AdmiralAsshat 8 hours ago
    Well, if anyone could do it properly, Moxie certainly has the track record.
  • f_allwein 7 hours ago
    Interesting! I wonder a) how much of an issue this addresses, ie how much are people worried about privacy when they use other LLMs? and b) how much of a disadvantage it is for Confer not to be able to read/train on user data.
  • LordDragonfang 1 hour ago
    > Advanced Passkey Features Required

    > This application requires passkey with PRF extension support for secure encryption key storage. Your browser or device doesn't support these advanced features.

    > Please use Chrome 116+, Firefox 139+, or Edge 141+ on a device with platform authentication (Face ID, Touch ID, Windows Hello, etc.).

    (Running Chrome 143)

    So... does this just not support desktops without overpriced webcams, or am I missing something?

    • literalAardvark 46 minutes ago
      Windows Hello should work fine just by PIN, it's the platform authentication part that's important, not the way you unlock it
  • jeroadhd 2 hours ago
    Again with the confidential VM and remote attestation crypto theater? Moxie has a good track record in general, and yet he seems to have a huge blindspot in trusting Intel's broken "trusted VM" computing for some inexplicable reason. He designed the user backups of Signal messages to the server with similar crypto secure "enclave" snake-oil.
    • tkz1312 1 hour ago
      AFAIK the signal backups use symmetric encryption with user generated and controlled keys and anonymous credentials (https://signal.org/blog/introducing-secure-backups/). Do you have a link about the usage of sgx there?

      Also fwiw I think tees and remote attestation are a pretty pragmatic solution here that meaningfully improves on the current state of the art for llm inference and I'm happy to see it.

    • liuliu 1 hour ago
      I think there is only so much you can do practically. Without a secure "enclave", there isn't really much you can do. What's your alternative?