Trajectory Is the New Interface

In the chat era, the interface between an AI system and its user was a string. The model produced text; the user read it. The whole product, on both sides, fit into a single rectangular box. This was a pleasant arrangement and it made for great demos.

In the agent era, the interface is a trajectory. The model produces a sequence of decisions, tool calls, observations, and corrections — and the user, increasingly, watches this sequence rather than just reading its final output.

The unit of communication is no longer “what the model said” but “what the model did.”

This change is more consequential than it sounds, because it reshapes what people expect to see, what they trust, and what they’re willing to pay for.

Showing the work is the product

The first time most people noticed this was probably with coding agents. When Claude Code or Cursor’s agent mode does a substantial task, the user watches it work. They see the files it opens, the tests it runs, the edits it proposes, the errors it catches, the dead ends it backs out of. This is not incidental. Showing the work is the product. A coding agent that just dropped a finished diff into your repo with no visible reasoning would feel suspicious, even if the diff were perfect. The visible trajectory is what makes the agent trustworthy enough to delegate to.

There’s a reason for this beyond aesthetics. When the model produces wrong output, and it occasionally does, the trajectory is the only way to figure out what happened. If you only see the final answer, the failure is opaque: you don’t know whether the agent misunderstood the request, picked the wrong tool, misread a file, or skipped a verification step. With the trajectory visible, you can locate the error precisely. The trajectory becomes the equivalent of a stack trace for AI behavior. Debugging an agent without trajectory visibility is roughly as hard as debugging a server without logs.
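To make the stack-trace analogy concrete, here is a minimal sketch of a trajectory as a list of typed steps, plus a helper that finds the earliest step that went wrong. Everything in it is an assumption for illustration: the `Step` fields, the `ok` flag (which presumes something, a human reviewer or an automated check, has already marked each step), and the sample steps themselves.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    index: int
    kind: str        # e.g. "decision", "tool_call", "observation", "correction"
    summary: str
    ok: bool = True  # hypothetical flag set by a reviewer or an automated check


def first_failure(trajectory: list[Step]) -> Optional[Step]:
    """Return the earliest step flagged as not ok, or None if every step passed."""
    for step in trajectory:
        if not step.ok:
            return step
    return None


# Illustrative trajectory: the agent misreads a file, then acts on the misreading.
trajectory = [
    Step(0, "decision", "Plan: fix the failing date parser test"),
    Step(1, "tool_call", "read_file parser.py"),
    Step(2, "observation", "Misread the timezone branch as unused", ok=False),
    Step(3, "tool_call", "edit parser.py: delete the timezone branch"),
]

bad = first_failure(trajectory)
if bad is not None:
    print(f"first bad step: #{bad.index} ({bad.kind}): {bad.summary}")
```

With only the final diff, you would see a deleted branch and have to guess why. With the steps in hand, the misreading at step 2 is where you start.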

Coherence is a stronger property than accuracy

The shift from output-as-interface to trajectory-as-interface also changes what counts as a “good” agent. In the output era, good meant accurate. In the trajectory era, good means coherent — the path the agent took makes sense, given the goal. An agent can be accurate by accident, and you’d never know unless the trajectory was visible. An agent that’s coherent will be approximately accurate most of the time, and when it’s wrong, you can see where it went wrong and adjust. Coherence is a stronger property than accuracy because it explains itself.

This is also why “explanation” stopped being a separable feature for agents. In the chat era, products built feature pages around “explainable AI”: separate views, often retrofitted, that tried to justify the model’s output. Trajectories made these obsolete by construction. The explanation isn’t a feature; it’s the form the work takes. You don’t need to ask the model to explain itself, because what the model did is right there.

Every agent product now needs a viewer

The thing trajectories require that outputs don’t is a viewer. Agents have an entirely new presentation problem: how do you show a fifty-step trajectory in a way a human can absorb? The answers are still being worked out. Some products show the full log, scrolled chronologically. Some collapse it into a structured tree of steps. Some surface only the high-level decisions and hide the low-level execution.
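The three presentation styles above can be sketched as three renderings of the same data. This is a toy under stated assumptions, not a description of how any real product does it: the `Step` type, the `level` field (0 for a high-level decision, higher for execution detail), and the mode names are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Step:
    title: str
    level: int  # assumed convention: 0 = high-level decision, 1+ = execution detail


def render(steps: list[Step], mode: str = "log") -> str:
    """Render one trajectory as a flat log, an indented tree, or a summary."""
    if mode == "log":
        # Full chronological log: every step, numbered in order.
        lines = [f"{i + 1}. {s.title}" for i, s in enumerate(steps)]
    elif mode == "tree":
        # Structured tree: indent each step by its level.
        lines = ["  " * s.level + "- " + s.title for s in steps]
    elif mode == "summary":
        # High-level only: hide everything below level 0.
        lines = ["- " + s.title for s in steps if s.level == 0]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return "\n".join(lines)


steps = [
    Step("Reproduce the failing test", 0),
    Step("run the test suite", 1),
    Step("Patch the parser", 0),
    Step("open parser.py", 1),
    Step("apply the edit", 1),
]

for mode in ("log", "tree", "summary"):
    print(f"[{mode}]")
    print(render(steps, mode))
```

The point of the sketch is that the viewer is a function of the trajectory, so the hard design work sits in deciding what structure the steps carry, not in the rendering itself.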

The right answer is probably task-dependent and will diverge for different agent types — a coding agent’s trajectory wants to be shown as a sequence of file edits and test runs, while a research agent’s trajectory wants to be shown as a citation graph. But every agent product is going to need a trajectory presentation layer, and the design discipline around this is still in its infancy.

Trajectories are contracts, not telemetry

There’s a deeper consequence here, which is that trajectories are contracts. When an agent works on a task and produces a visible trajectory, that trajectory is the record you will be held to if something goes wrong. This is true in the ordinary sense, since you can review what the agent did before approving it, and also in a stronger sense: the trajectory is auditable evidence of how a decision was made, which matters for compliance, for legal liability, and for trust with customers. In regulated industries, the trajectory may eventually carry more weight than the output. “The model produced this answer” is a weak claim. “The agent followed this verifiable sequence of steps, each of which is logged, with these specific external sources consulted, and these specific checks performed, before producing this output” is a much stronger one.
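One way to make a logged sequence "verifiable" in the sense above is a hash chain, where each entry's hash covers the previous entry's hash, so editing any earlier step invalidates everything after it. To be clear, this is a technique swapped in for illustration, not something the argument depends on, and the entry layout and step fields below are assumptions. A real audit log would also need signing and secure storage, which this sketch leaves out.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, step: dict) -> str:
    # Serialize with sorted keys so the same step always hashes the same way.
    payload = json.dumps(step, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_step(log: list[dict], step: dict) -> None:
    """Append a step, chaining its hash to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"step": step, "prev": prev_hash, "hash": _digest(prev_hash, step)})


def verify(log: list[dict]) -> bool:
    """Recompute every hash in order; any edited or reordered entry fails."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["step"]):
            return False
        prev_hash = entry["hash"]
    return True


log: list[dict] = []
append_step(log, {"kind": "source", "detail": "consulted policy document"})
append_step(log, {"kind": "check", "detail": "verified totals match"})
append_step(log, {"kind": "output", "detail": "produced answer"})
print(verify(log))  # True

# Rewriting history after the fact breaks the chain.
log[1]["step"]["detail"] = "skipped verification"
print(verify(log))  # False
```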

The trajectory-as-interface idea also reshapes how agents should be built, not just how they should be shown. If the trajectory is the artifact, then the trajectory should be designed, not merely emitted. Good trajectories are short, well-structured, and self-documenting. Bad trajectories meander, repeat themselves, and leave the reader to figure out what was going on. Harnesses can influence this — by enforcing structure on tool call results, by formatting summaries at key junctures, by limiting redundancy — and the harnesses that take trajectory design seriously produce work that’s both more reliable and more reviewable.
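As a rough sketch of what "designed, not merely emitted" could mean at the harness level, the function below applies two of the levers mentioned: a fixed structure for tool call results and a limit on redundancy. The step shape (`tool`, `args`, `result` as strings), the 80-character cap, and the rule of dropping only exact consecutive repeats are all assumptions chosen for the example. A later call with the same arguments but a different result, such as rerunning tests after an edit, is kept.

```python
def shape_trajectory(steps: list[dict], max_chars: int = 80) -> list[dict]:
    """Return a cleaned copy of a trajectory without modifying the input."""
    shaped: list[dict] = []
    previous = None
    for step in steps:
        # Limit redundancy: skip a step that exactly repeats the one before it.
        if step == previous:
            continue
        previous = step
        # Enforce structure: cap long results and say how much was cut.
        result = step["result"]
        if len(result) > max_chars:
            omitted = len(result) - max_chars
            result = f"{result[:max_chars]}... [{omitted} chars omitted]"
        shaped.append({"tool": step["tool"], "args": step["args"], "result": result})
    return shaped


raw = [
    {"tool": "read_file", "args": "parser.py", "result": "x" * 500},
    {"tool": "read_file", "args": "parser.py", "result": "x" * 500},  # exact repeat
    {"tool": "run_tests", "args": "tests/", "result": "3 passed"},
]

for step in shape_trajectory(raw):
    print(step["tool"], step["args"], len(step["result"]))
```

Whether truncation and deduplication are the right levers depends on the agent. The broader idea is that the harness, not the reader, does the tidying.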

There’s also a curious side effect: trajectories are training data. Once you’ve shipped a system that produces trajectories, you have a corpus of how the model behaves in your specific environment with your specific tools. This is exactly the kind of data that’s useful for fine-tuning, for evaluation, for debugging, and (with appropriate care) for distilling agent behavior into faster systems. The trajectory exhaust becomes a flywheel: ship an agent, collect its trajectories, use them to improve the next version of the agent. The interface and the training signal are the same thing.
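The first turn of that flywheel can be as plain as filtering stored trajectories and writing them out one JSON object per line. The sketch below assumes each trajectory is a dict with `task`, `steps`, and an `outcome` label. Those field names are hypothetical, and how outcomes get labeled (and what needs scrubbing before reuse) is exactly the "appropriate care" part, which this does not address.

```python
import io
import json
from typing import TextIO


def export_jsonl(trajectories: list[dict], out: TextIO, keep_outcome: str = "success") -> int:
    """Write trajectories with the given outcome as JSON lines; return the count."""
    written = 0
    for traj in trajectories:
        if traj.get("outcome") != keep_outcome:
            continue
        # Keep only the fields a downstream eval or training job would read.
        record = {"task": traj["task"], "steps": traj["steps"]}
        out.write(json.dumps(record) + "\n")
        written += 1
    return written


collected = [
    {"task": "fix date parser", "steps": ["read parser.py", "edit", "run tests"], "outcome": "success"},
    {"task": "rename module", "steps": ["grep imports", "edit"], "outcome": "failure"},
]

buffer = io.StringIO()  # stands in for a real file opened for writing
count = export_jsonl(collected, buffer)
print(count, "trajectory exported")
```

Passing `keep_outcome="failure"` instead gives you the other half of the corpus, which is often the more useful half for evaluation and debugging.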

A useful way to put all this is: in the chat era, the model was a function from prompt to output, and the function’s output was what mattered. In the agent era, the model is a participant in a process, and the process is what matters. We started by displaying outputs because we hadn’t built the substrate to display processes; now we have, and the process is what people want to see. The interface has caught up with the shape of the work.

What this means in practice is that anyone building agents in 2026 needs to take the trajectory layer seriously. It’s not just telemetry. It’s not just a debugging tool. It’s the product surface. The next post is about what happens when you take this seriously enough that evaluation also shifts to live there, when agent evals become behavioral instead of completion-based.