Whitepaper · v1.0 · 2026

The Amnesia Tax

Why AI coding agents keep re-learning the same lessons — and what that costs a team that runs them all day, every day.

Anyone running Claude Code or Cursor across a team has watched this happen: one engineer's agent spends ten minutes re-deriving something another engineer's agent already nailed down the day before, because there was nowhere for that answer to live where an agent could read it back. Here's what that actually costs, and what we built to fix it.

Contents

1. The problem: agents forget, teams pay for it
2. The amnesia tax, quantified
3. Where the waste actually lives
4. Shared memory as infrastructure
5. Why vendor-agnostic matters
6. What a team gets back
7. What does and doesn’t leave your machine
8. Objections, answered directly
9. Getting started

1. The problem: agents forget, teams pay for it

AI coding agents such as Claude Code, Cursor, and the other tools that speak the Model Context Protocol (MCP) are genuinely good within a session. They can read a codebase, hold a lot of it in context, and reason through a gnarly bug faster than most humans would. Then the session ends, and all of that gets thrown away.

The next session, whether it's the same engineer tomorrow morning or a teammate an hour later, starts from a blank slate. It doesn't know that the team already tried switching the retry backoff to exponential and it caused a thundering herd in production. It doesn't know the payments service intentionally violates the lint rule about floating-point currency math because the accounting team requires it. It doesn't know that “just bump the Node version” was tried three weeks ago and broke the Lambda cold-start budget.

So it re-derives all of that. Sometimes it gets there fine, just slower — burning tokens and minutes to rediscover something the team already knew. Sometimes it gets there wrong, and reintroduces a fix that was already tried and reverted for a reason nobody told it about. Multiply either version across every engineer's agent, every session, every day, and you've got a real ongoing cost that never shows up as a single line item — we call it the amnesia tax.

The models aren't the bottleneck here. The gap is that agents have nowhere durable and shared to put what they learn. A scratch file in one engineer's home directory doesn't reach a teammate's machine. A chat transcript in Claude Code doesn't reach a Cursor session. Code comments, when they exist at all, explain what the code does — almost never why it's written that way, and never what the team already tried and ruled out.

2. The amnesia tax, quantified

We're not going to hand you a made-up industry statistic like “teams waste 34% of their AI spend on redundant context.” Nobody has that number, and anyone who quotes one to you made it up. What you can reason about directly is where the cost actually comes from:

Tokens. Re-deriving something from scratch means an agent has to read the relevant files again and reason through them again — that's input tokens for every file it opens and output tokens for the reasoning trace. Recalling a written-down answer is one short query. The gap between those two, per occurrence, is usually an order of magnitude or more, though the exact number depends entirely on how deep the original investigation was.
Time. The token bill is the part you can see on an invoice. The part you feel is the engineer sitting there while the agent re-explores a codebase it's effectively already explored, on someone else's machine, last week.
Wrong answers. This is the expensive one. An agent that doesn't know a fix was already tried and reverted will happily propose it again, with full confidence, and nothing in its context will contradict it.
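
The token gap described above is easy to estimate once you plug in your own numbers. The calculator below is a minimal sketch: the names (`Investigation`, `amnesia_tax`) are ours, and every figure in the usage example is a placeholder picked to keep the arithmetic readable, not a measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Investigation:
    """Token counts for one piece of knowledge; all values are user-supplied."""
    derive_input_tokens: int    # files the agent reads to work it out from scratch
    derive_output_tokens: int   # reasoning the agent emits while working it out
    recall_input_tokens: int    # the short recalled entry placed in context
    recall_output_tokens: int   # the query the agent emits to fetch it


def cost(input_tokens: int, output_tokens: int,
         input_price: float, output_price: float) -> float:
    """Cost in currency units, with prices quoted per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def amnesia_tax(inv: Investigation, repeats: int,
                input_price: float, output_price: float) -> float:
    """Extra spend when the same answer is re-derived `repeats` times
    instead of being recalled `repeats` times."""
    derive = cost(inv.derive_input_tokens, inv.derive_output_tokens,
                  input_price, output_price)
    recall = cost(inv.recall_input_tokens, inv.recall_output_tokens,
                  input_price, output_price)
    return repeats * (derive - recall)


if __name__ == "__main__":
    # Placeholder numbers only: substitute your own token counts and prices.
    example = Investigation(
        derive_input_tokens=40_000, derive_output_tokens=4_000,
        recall_input_tokens=300, recall_output_tokens=50,
    )
    tax = amnesia_tax(example, repeats=10, input_price=1.0, output_price=5.0)
    print(f"extra spend over 10 repeats: {tax:.4f}")
```

This only covers the token line. Engineer time and wrong answers do not reduce to a formula this simple, which is why they tend to cost more.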

3. Where the waste actually lives

Watch a few agent sessions back to back on the same repo and the pattern is obvious:

The first few minutes of every session

A lot of the early tool calls in a fresh session are the agent re-establishing things a previous session already established — the deploy target, a convention, why some piece of code looks off. None of that is new work. It's the same homework, redone.

Claude Code and Cursor not talking to each other

If you and a teammate are on different tools, your agents are working the same repo with zero visibility into each other. Whatever one agent learns, someone has to write up in Slack by hand and hope the other person reads it before they hit the same wall, which mostly doesn't happen in time.

The one person who remembers why

Every team has that engineer who knows why the retry logic looks strange. If they're on PTO, or the thread where they explained it has scrolled out of sight, that context is effectively gone.

4. Shared memory as infrastructure

None of this needs a smarter model to fix. It needs a place for the team's agents to put what they learn — something small and fast enough that writing to it is free of friction and querying it is cheaper than re-deriving the answer.

That's threadctx: an MCP server that gives AI coding agents durable, team-shared memory. When an agent figures out something worth keeping — a root cause, a convention, why a decision went the way it did — it writes a short entry scoped to that repo. The next agent to touch the repo, on any tool, for any teammate, can pull it back with one query instead of rebuilding it from nothing.

The entries stay small on purpose — a sentence or two, not a transcript dump and definitely not your source code. Think of it as the note a good senior engineer would leave in a Slack thread, except an agent actually reads it before repeating the mistake.
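
The write-once, recall-anywhere loop can be sketched in a few lines. This is not threadctx's implementation or its schema: the `MemoryStore` class, its `remember` and `recall` methods, the entry fields, the repo name, and the keyword match are all hypothetical, chosen only to show short entries being scoped to a repo and read back by a later session.

```python
import json
import tempfile
from pathlib import Path


class MemoryStore:
    """Hypothetical repo-scoped memory backed by one JSON file."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def _load(self) -> dict[str, list[dict[str, str]]]:
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def remember(self, repo: str, note: str, author: str) -> None:
        """Append a short learning under the given repo."""
        data = self._load()
        data.setdefault(repo, []).append({"note": note, "author": author})
        self.path.write_text(json.dumps(data, indent=2), encoding="utf-8")

    def recall(self, repo: str, query: str) -> list[str]:
        """Return notes for this repo that mention every word in the query."""
        words = query.lower().split()
        return [
            entry["note"]
            for entry in self._load().get(repo, [])
            if all(word in entry["note"].lower() for word in words)
        ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store_path = Path(tmp) / "memory.json"

        # Session 1: one engineer's agent records what it learned.
        MemoryStore(store_path).remember(
            repo="example/api",
            note="Exponential retry backoff caused a thundering herd in production; reverted.",
            author="agent-a",
        )

        # Session 2: a different agent, possibly in a different tool, asks first.
        hits = MemoryStore(store_path).recall("example/api", "retry backoff")
        print(hits)
```

The second session builds a fresh `MemoryStore` on purpose: nothing carries over in process memory, only what was written to disk.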

5. Why vendor-agnostic matters

Most engineering teams aren't on one AI coding tool anymore. Some people are on Claude Code, some are on Cursor, and whoever you hire next quarter will probably bring in something else. If your memory layer only works inside one of those tools, it's not really a team memory layer; it's one person's notes with a nicer UI.

threadctx runs on the Model Context Protocol, the open standard Anthropic introduced for connecting agents to external tools and data, which Cursor and most of the rest of the ecosystem have since adopted too. In practice that gives you three things:

One install, every client. The same threadctx config works in Claude Code and Cursor today, and in whatever MCP-speaking tool your team picks up next.
Nothing to migrate away from. Local mode is free, MIT-licensed, and stores memory as plain JSON on disk (~/.threadctx/local.json). You can read it, grep it, or delete it whenever you want.
You're not betting on our roadmap. If we disappeared tomorrow, the protocol and your local memory files would still be yours. Try saying that about a memory feature bolted onto a single vendor's editor.
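
On the first point, many MCP clients register a server through a small JSON stanza. The sketch below prints one, assuming the common `mcpServers` layout with `command` and `args` keys and the `npx threadctx-mcp` command from the Getting started section. The server label `threadctx` is our choice, and the file the stanza belongs in differs per client, so treat this as a starting point and check each client's documentation.

```python
import json


def mcp_server_stanza(name: str, command: str, args: list[str]) -> dict:
    """Build the `mcpServers` block many MCP clients read from their config file."""
    return {"mcpServers": {name: {"command": command, "args": args}}}


if __name__ == "__main__":
    # Assumption: the server starts with the same command shown under Getting started.
    stanza = mcp_server_stanza("threadctx", "npx", ["threadctx-mcp"])
    print(json.dumps(stanza, indent=2))
```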

6. What a team gets back

Concretely, here's what changes:

Without shared memory	With threadctx
Every engineer's agent re-derives the same context	First agent to hit it writes it once, everyone recalls it
Tokens spent reconstructing known answers	A single cheap recall query replaces re-derivation
A fix that was tried and reverted quietly reappears in a different PR	The agent recalls “we tried that, it broke prod” before proposing it
New hire's agent knows nothing about the codebase's history	New hire's agent inherits the team's accumulated memory on day one
Knowledge locked to one tool, one person, one chat log	Knowledge is a team asset, portable across tools and people

The part worth noticing: someone pays the cost of figuring something out exactly once. After that, it's a cheap read for everyone else on the repo, indefinitely. A team of five doesn't just save five times what a solo developer saves: each person's agent writes entries the other four can reuse and reuses entries the other four wrote, so the number of avoided re-derivations grows faster than headcount.

7. What does and doesn't leave your machine

The obvious question is what a tool like this actually sends off your machine, so here's the honest answer:

Not your source code. Entries are short, agent-written learnings — not file contents, not diffs, not the codebase itself.
Local mode never phones home. The only network traffic is whatever your agent already sends to your LLM provider. Memory sits in a JSON file on your machine and nowhere else.
Cloud mode is scoped per repo and per team. Only entries your own team wrote for that specific repo come back on a query, and only with your API key.
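
Because local memory is a plain JSON file, you can check the first claim yourself. The sketch below assumes nothing about the file's schema: it walks whatever JSON it finds and flags every string value longer than a threshold, so anything resembling pasted file contents stands out. The path is the one named earlier in this paper; the function names and the 500-character threshold are ours.

```python
import json
from pathlib import Path
from typing import Any, Iterator


def iter_strings(node: Any, trail: str = "$") -> Iterator[tuple[str, str]]:
    """Yield (json_path, value) for every string in an arbitrary JSON tree."""
    if isinstance(node, str):
        yield trail, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from iter_strings(value, f"{trail}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_strings(value, f"{trail}[{index}]")


def long_strings(document: Any, max_chars: int = 500) -> list[tuple[str, int]]:
    """Return (json_path, length) for strings longer than `max_chars`."""
    return [(path, len(text)) for path, text in iter_strings(document)
            if len(text) > max_chars]


if __name__ == "__main__":
    memory_file = Path.home() / ".threadctx" / "local.json"
    if memory_file.exists():
        data = json.loads(memory_file.read_text(encoding="utf-8"))
        for path, length in long_strings(data):
            print(f"{path}: {length} characters")
    else:
        print(f"{memory_file} not found")
```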

8. Objections, answered directly

“Isn't this just a vector database with extra steps?”

A vector database is a piece of the plumbing, sure. What you actually get is the MCP wiring agents already speak, per-repo scoping, an identity that follows you across Claude Code and Cursor, and a team billing model: the stuff that turns “store some embeddings” into something a team can point at a repo in one command.

“Won't this just fill up with junk over time?”

Entries only get written when an agent actually learns something, and they're short — a sentence or two, not a session transcript. Recall is scoped to what the current task needs, so old irrelevant entries don't get dragged into context just because they exist.
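
One way to keep stale entries out of context is to rank by relevance to the current task and cap how many come back. The sketch below uses plain word overlap as the relevance score. That scoring, the `min_score` floor, the `limit` cap, and the function names are illustrative assumptions, not a description of how threadctx ranks entries.

```python
def tokenize(text: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    return {word.strip(".,;:!?()\"'").lower() for word in text.split()} - {""}


def recall_relevant(entries: list[str], task: str,
                    limit: int = 3, min_score: float = 0.2) -> list[str]:
    """Return at most `limit` entries whose word overlap with the task
    (Jaccard similarity) is at least `min_score`, best match first."""
    task_words = tokenize(task)
    scored = []
    for entry in entries:
        entry_words = tokenize(entry)
        union = task_words | entry_words
        score = len(task_words & entry_words) / len(union) if union else 0.0
        if score >= min_score:
            scored.append((score, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:limit]]


if __name__ == "__main__":
    memory = [
        "Exponential retry backoff caused a thundering herd; reverted.",
        "Bumping the Node version broke the Lambda cold-start budget.",
        "Payments code keeps floating-point currency math on purpose.",
    ]
    print(recall_relevant(memory, "switch retry backoff to exponential"))
```

Entries that share no vocabulary with the task score zero and never reach the agent, however many of them the store holds.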

“What if we outgrow it, or just want out?”

Local mode is MIT-licensed. Fork it, self-host it, or open the JSON file directly and read it yourself. We didn't invent a proprietary format to lock you into, so there's nothing to migrate away from.

9. Getting started

Solo, local memory is free, open source, and installs in one command. No signup required.

$ npx threadctx-mcp

When your team is ready to share memory across people and tools, start a Team trial.
