The Harness

The model is the brain. The harness is everything you build around it so the brain can actually do work. Today the model is rarely what holds you back. The harness is.

What a harness is

The model is the intelligence: it understands a request and produces a response. The harness is the scaffolding around it that lets the model do things: where it runs, what it remembers from one session to the next, which tools it can reach, and how its work gets checked. If the model is the brain, the harness is the hands, the memory, and the workspace.

The model decides how smart the AI is. The harness decides whether that intelligence turns into useful work.
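To make the split concrete, here is a minimal sketch of the four pieces named above: where the model runs, what it remembers, which tools it can reach, and how its work gets checked. Everything in it is an illustrative assumption rather than any real product's API: the `Harness` class, the `stub_model` function, and the "name: argument" convention for requesting a tool are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types: a "model" is any function from prompt text to reply text.
Model = Callable[[str], str]
Tool = Callable[[str], str]
Check = Callable[[str], bool]


@dataclass
class Harness:
    model: Model                                      # the brain
    tools: Dict[str, Tool] = field(default_factory=dict)   # the hands
    checks: List[Check] = field(default_factory=list)      # how work gets checked
    memory: List[str] = field(default_factory=list)        # kept across run() calls

    def run(self, request: str) -> str:
        # Memory: prepend what earlier runs recorded.
        context = "\n".join(self.memory)
        prompt = f"{context}\n{request}" if context else request
        reply = self.model(prompt)

        # Tools: assumed convention, the model asks with "name: argument".
        if ":" in reply:
            name, arg = reply.split(":", 1)
            if name.strip() in self.tools:
                reply = self.tools[name.strip()](arg.strip())

        # Checks: reject output that fails any check.
        if not all(check(reply) for check in self.checks):
            raise ValueError("output failed a harness check")

        self.memory.append(f"{request} -> {reply}")
        return reply


def stub_model(prompt: str) -> str:
    # Stand-in for a real model call; always asks for the "upper" tool.
    return "upper: " + prompt.splitlines()[-1]


harness = Harness(
    model=stub_model,
    tools={"upper": str.upper},
    checks=[lambda text: len(text) > 0],
)
print(harness.run("hello"))
```

Swapping `stub_model` for a different function changes the brain; everything else in the class is harness.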

The proof: same model, very different results

In one published benchmark, the same Claude model, identical weights, scored 78% running inside Claude Code and 42% inside a different harness. Same brain, nearly double the score, purely because of what was built around it.

That gap is the whole point. The harness is not a finishing touch on top of a good model. It is a multiplier on everything the model can do.

A separate team reached the same conclusion a different way. Cursor built a strong custom harness around a single model and set it loose on an extreme task: writing a working web browser from scratch. It ran for weeks and produced three million lines of working code.

We are in the harness age

For years the race was about the model: who has the smartest one, who tops the benchmarks. That race is largely over. The leading models are all extremely capable and increasingly alike. What still differs, sharply, is the harness. That is where the real differences now live, and where the advantage is won.

The model is becoming a commodity you can buy. The harness is something you build, and it compounds: every workflow, check, and piece of context a team builds around its tools makes the next bit of work better, and a competitor cannot buy that overnight. This is why a capable team with average models and a strong harness beats a team with the best models and no harness.

Part of the harness is temporary, part is durable

Not all of a harness is worth the same investment. Some of it exists only to patch what today’s models cannot do reliably yet, and that part should be expected to shrink and get thrown away as the models improve.

The durable part is the layer that keeps paying off as models get smarter: the context you assemble, the skills and judgment you encode, the workflows you standardize. The same encoded skill produces better results the moment a smarter model drops, with no change on your end. So invest in that layer, and build the patches lightly, knowing you will rebuild them in a few months.
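One way to picture the two layers, as a rough sketch under stated assumptions: the skill is plain text that works with any model, while the patch is a thin wrapper you can delete later. The names `SUMMARY_SKILL`, `run_skill`, `with_retry`, and both stub models are hypothetical and stand in for whatever a team actually uses.

```python
from typing import Callable

# Hypothetical type: a "model" is any function from prompt text to reply text.
Model = Callable[[str], str]

# Durable layer: a skill written once, independent of any particular model.
SUMMARY_SKILL = "Summarize the text in one sentence. Lead with the decision."


def run_skill(model: Model, skill: str, text: str) -> str:
    """Apply the same encoded skill with whichever model is passed in."""
    return model(f"{skill}\n\n{text}").strip()


# Temporary layer: a patch for a model that sometimes returns nothing.
def with_retry(model: Model, attempts: int = 3) -> Model:
    def patched(prompt: str) -> str:
        reply = ""
        for _ in range(attempts):
            reply = model(prompt)
            if reply.strip():
                break
        return reply
    return patched


def make_flaky_model() -> Model:
    """Stub for today's model: returns an empty reply on its first call."""
    calls = {"n": 0}

    def model(prompt: str) -> str:
        calls["n"] += 1
        return "" if calls["n"] == 1 else "Decision: ship it."

    return model


def newer_model(prompt: str) -> str:
    """Stub for a later model that needs no patch."""
    return "Decision: ship it, after the fix lands."


# Today: the skill runs through the patch.
print(run_skill(with_retry(make_flaky_model()), SUMMARY_SKILL, "meeting notes"))
# Later: the patch is dropped, the skill is unchanged.
print(run_skill(newer_model, SUMMARY_SKILL, "meeting notes"))
```

The design choice is that `with_retry` wraps the model rather than living inside `run_skill`, so removing the patch never touches the skill.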

The harness scales from the desk to the company

The same idea works at two sizes:

  • At your desk, your harness is the commands, skills, context, and checks you build around your AI so it does your work well. The redesign rules are how you build it.
  • At the company, the harness is the shared context every person’s AI can reach, the goals and judgment encoded so the work does not depend on who is in the room, and the workflows the whole organization runs on.

That company-level harness is what an AI-native company builds. You do not recognize one by whether it uses AI, pays for the best models, or is good at prompting. Plenty of companies do all three and still have only a pile of tools, not a system. You recognize it by whether it has built the harnesses. Building the harness is not all there is to becoming AI native. The judgment that builds and directs it matters just as much. But the harness is the central thing a company builds. And its people have made the same shift, from doing the work to building the systems that do it.

Sources