Architecture

Federated rule learning, without ever reading your mail

2026-04-26 · 8 min read

Most email clients sort your inbox using static rules — "if From contains @newsletter.com, file to Promotions." Those rules don't get smarter. The big providers solve that with machine learning trained on your messages: their algorithm reads your mail, learns from millions of users' behavior, and gets better. The trade you make is reading rights. We don't think that trade is necessary. Here's a different way.

The premise: privacy and intelligence aren't opposed

When a junk newsletter shows up in everyone's inbox and everyone moves it to Promotions, that pattern is a useful signal — and it's a pattern about the sender, not about any individual recipient. The sender broadcasts its identity through headers like List-Unsubscribe and a sending domain every relay can see. Nothing that looks at those signals is reading your mail.

The trick, then, is to learn from collective behavior over public-by-construction signals — and to make it structurally impossible for the system to ever look at private signals. Not "trust us we won't," but "the type system literally does not have a way to."

What "public-by-construction" means in practice

When you flag a message as Promotions, the only things our client ever derives from that action are a small set of header features:

  • What domain it came from (e.g., @mailchimp.com) — already on the envelope, visible to every relay it touched.
  • Whether the message advertised a List-Unsubscribe header — a public, IETF-standardized signal (RFC 2369) the sender chose to publish.
  • Whether it carried a high-priority hint.
  • A handful of subject keywords the sender themselves wrote into the header.

That's the entire feature surface. Bodies, addresses, names, the recipient list — none of those exist in the structure that gets shared. Not as a policy. As a property of the design: there is no way to put them in. Anyone who reviews the code (auditors, regulators, you) can verify that in a few minutes.

How the collective gets smarter

Each install observes its user's flags privately and derives candidate sorting rules — "this user moves @mailchimp.com to Promotions every time." Those rules take effect locally, immediately. If the user opts in, a copy is signed with a per-device key and submitted to a public corroboration ledger.
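The local derivation step can be sketched in a few lines. Everything here is an assumption for illustration — the names and the "three consistent observations" threshold are ours, not documented behavior:

```typescript
// Hypothetical sketch: a candidate rule is proposed only after the user
// files the same sender domain to the same folder several times.
type CandidateRule = { matchDomain: string; fileTo: string };

function deriveCandidateRules(
  flags: { senderDomain: string; folder: string }[],
  minObservations = 3, // assumed threshold, not a documented value
): CandidateRule[] {
  // Count (domain, folder) pairs using a separator that can't appear in either.
  const counts = new Map<string, number>();
  for (const flag of flags) {
    const key = `${flag.senderDomain}\u0000${flag.folder}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  const rules: CandidateRule[] = [];
  for (const [key, n] of counts) {
    if (n >= minObservations) {
      const [matchDomain, fileTo] = key.split("\u0000");
      rules.push({ matchDomain, fileTo });
    }
  }
  return rules;
}
```

Note what the input is: domains and folder names, nothing else. Even before signing and submission, the candidate rule already contains no private material.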

The ledger only counts. It does not store who you are, when exactly you submitted, what your IP was, or what's in any message. It tracks, per rule pattern, how many independent installs have proposed it. When enough installs converge on the same pattern — and we mean really converge, not just one user with multiple devices — the rule surfaces in everyone else's client as a suggestion. One click to accept; one click to reject. Local rules always beat platform suggestions; suggestions can never promote a sender into your important inbox.
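A minimal sketch of that counting behavior, under our own naming (the real ledger is signed and distributed; this toy version only shows the data it does and doesn't hold):

```typescript
// Hypothetical sketch of the corroboration ledger. Per rule pattern it
// keeps only a set of install identifiers — no timestamps, no IPs, no
// message content — so one install resubmitting never double-counts.
class CorroborationLedger {
  private proposals = new Map<string, Set<string>>();

  constructor(private readonly threshold: number) {}

  submit(installId: string, rulePattern: string): void {
    const installs = this.proposals.get(rulePattern) ?? new Set<string>();
    installs.add(installId); // a Set: one vote per install, however often it submits
    this.proposals.set(rulePattern, installs);
  }

  /** Rule patterns corroborated by at least `threshold` independent installs. */
  suggestions(): string[] {
    return [...this.proposals.entries()]
      .filter(([, installs]) => installs.size >= this.threshold)
      .map(([pattern]) => pattern);
  }
}
```

The "really converge" requirement shows up as the de-duplicating `Set`: a single user submitting from one install ten times still counts once.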

The result: Gmail-quality sorting that gets smarter from the network without anyone — including us — reading anyone's mail.

What this is not

It's not magic and it's not machine learning. There's no model. The patterns are simple typed rules — exactly the kind of "if header X matches Y, file to Z" that mail filters have used since the 1990s. The novelty is the signing-and-corroboration layer, the careful refusal to ship anything else, and the public verifiability of both.
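That rule shape — and the "local rules always beat platform suggestions" ordering from the previous section — fits in a dozen lines. As before, the names here are illustrative, not the shipping API:

```typescript
// Sketch of the rule shape described above: a plain typed predicate,
// "if header X matches Y, file to Z" — not a learned model.
type SortRule = { header: string; pattern: RegExp; fileTo: string };

// Local (user-authored) rules are checked before network suggestions,
// so a local rule always wins when both match.
function route(
  headers: Map<string, string>,
  localRules: SortRule[],
  suggestedRules: SortRule[],
): string {
  for (const rule of [...localRules, ...suggestedRules]) {
    const value = headers.get(rule.header);
    if (value !== undefined && rule.pattern.test(value)) return rule.fileTo;
  }
  return "Inbox"; // nothing matched — leave the message where it is
}
```

Precedence by concatenation order is the whole trick: a suggestion can only ever fire when no local rule claimed the message first.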

It also isn't a silver bullet for every classification problem. Spam filtering, for instance, has to look at content because spam evades static rules; that's a different problem and we don't claim to solve it this way. What we do claim is that ordinary sorting — the difference between a Mailchimp blast and a real customer email — does not need anyone reading your mail.

Why we think this matters

There's a category of company that builds infrastructure on the premise that privacy is a niche concern — that ordinary users will trade reading rights for better features. We don't think that's true; we think those features can be delivered another way, and the only reason they aren't is that the alternative is harder to design and harder to monetize.

PlausiDen builds infrastructure for organizations that already know they're in the other category — law firms, healthcare practices, journalists, financial advisors, advocacy groups — where "don't read our data" is the requirement, not the upsell. The federated rule loop is one of several pieces in a larger stack we use to deliver the experience their users expect without compromising the structural guarantees they need.

The shape of "compose, don't compromise"

If you've worked with us, you've heard us use the phrase "compose, don't compromise." Federated learning over public signals is the canonical example of what that phrase means. Two things that look mutually exclusive — privacy-as-a-property vs. continuous improvement — turn out to be compatible if you accept a constraint and design around it. The constraint is "never look at body content for routing." The design that survives that constraint produces a system that's actually more auditable than the alternative, because the privacy guarantee lives in the type system rather than in a policy.

Most of what we build for clients shares this structure. We don't promise privacy as a courtesy and ask you to trust us; we design pipelines where the promise is provable from the code. That makes audits faster, compliance reviews shorter, and incident response cheaper, because the answer to "could this system have leaked X?" is sometimes "no — by construction" instead of "let me check the logs."

Where this lives

Today the federated rule loop runs inside our internal tooling stack. It's the kind of work we do when we're solving a problem for a client that wants smarter classification but cannot allow content access — and we've found that articulating it as a general primitive, rather than a one-off, makes it useful across multiple engagements. Over time, pieces of this stack will become open source; we're being patient about timing because building them properly is a real investment, and we'd rather get sustainable revenue first than ship half a system into the world for free.

If you're working on something where the privacy / capability trade-off feels false — where you've been told you have to choose between sovereignty and good UX — we can probably help. The conversation usually starts with a specific bottleneck and ends with a different design choice you didn't know was on the table.

Working on something where this kind of thinking matters? Get in touch.