There’s a pattern that took a surprisingly long time for the AI field to name, even though it had been implicit in good harness design for years. The pattern is: don’t show the model everything it could possibly do. Show it the things relevant to the moment, with a way to discover the rest if it needs to.
The name that has stuck is progressive disclosure. It’s borrowed from UI design, where it describes the principle of revealing complexity only as the user needs it. In agent design it does the same job for capabilities: a small surface area is presented up front, and additional detail is loaded on demand. It sounds almost too simple to be a “primitive,” but its absence in the prompt era was responsible for a remarkable share of the field’s pathologies.
What goes wrong when the model sees everything
Consider what the alternative looks like. In a system without progressive disclosure, the model sees every tool, every skill, every constraint, every piece of context, on every turn. The system prompt lists all forty-three available tools with their full descriptions. The skills are all inlined into the prompt because there’s no mechanism for selective loading. The retrieval system returns the top twenty documents whether or not they’re relevant, because there’s no way for the model to ask for more if these aren’t enough. Everything is presented at maximum verbosity, all the time.
Three things go wrong, predictably. First, the context bloats: most of what’s loaded won’t be used, but you pay for it in tokens and in attention. Second, the model gets distracted: with forty-three tools described, the odds that it reaches for a wrong one on any given task go up. Third, the model gets timid: when every constraint is foregrounded, the model interprets normal actions as potentially constraint-violating and asks for confirmation about everything.
Progressive disclosure fixes all three by changing the basic interaction pattern: the model doesn’t get the full library, it gets a card catalog. To use a specific item, it has to fetch it. The fetch is cheap and easy, but the act of fetching is meaningful — it means the model has decided this specific item is relevant. The model’s context contains only what it has chosen to load, which is dramatically less than what it could theoretically use.
A card catalog, not the full library
The mechanism is usually some variant of the following. The harness exposes a small set of meta-tools, the most important of which lists or searches the available skills or capabilities. The model uses this tool to discover what’s relevant. The result is a list of short descriptions. The model picks one and fetches the full content. Now its context contains a focused, task-relevant set of instructions, not the whole library.
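As a rough illustration of that list-then-fetch loop, here is a minimal sketch in Python. Everything in it is hypothetical: the `Skill` and `SkillCatalog` classes, the `list_skills` and `load_skill` meta-tools, and the two example skills are invented for this post and are not the API of any particular harness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    summary: str  # one-line description shown in the catalog
    body: str     # full instructions, returned only when fetched


class SkillCatalog:
    """Hypothetical in-memory catalog exposing two meta-tools."""

    def __init__(self, skills: list[Skill]) -> None:
        self._skills = {s.name: s for s in skills}

    def list_skills(self, query: str = "") -> list[dict[str, str]]:
        """Meta-tool 1: short descriptions only, optionally filtered by keyword."""
        q = query.lower()
        return [
            {"name": s.name, "summary": s.summary}
            for s in self._skills.values()
            if q in s.name.lower() or q in s.summary.lower()
        ]

    def load_skill(self, name: str) -> str:
        """Meta-tool 2: full body of the one skill the model picked."""
        return self._skills[name].body


# Example skills are placeholders, not real instructions.
catalog = SkillCatalog([
    Skill("pdf-report", "Build a PDF report from tabular data.",
          "Full instructions for building PDF reports go here."),
    Skill("sql-migrate", "Write and apply a database migration.",
          "Full instructions for database migrations go here."),
])

# The context starts with summaries only; bodies enter it one fetch at a time.
matches = catalog.list_skills("report")
context = [catalog.load_skill(matches[0]["name"])]
```

The design choice worth noticing is that `list_skills` never returns a body. The only way full instructions reach the context is through an explicit `load_skill` call, which is the "meaningful fetch" described above.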
This is exactly how a human consultant operates when they start with a new client. They don’t memorize the company’s entire process manual on day one. They learn the structure, and then they fetch the specific chapter they need when a question comes up. An agent system built this way has a small permanent footprint and a large lazy one. Most of what the consultant knows is “where to look,” not “what to do.”
The reliability benefits compound in ways that are easy to underestimate. When the model is loading specific skills for specific tasks, the relevant instructions are recent in context, freshly stated, and unaccompanied by competing concerns. This is exactly the regime where models perform best. Instructions in this position are followed crisply. Compare to the system-prompt regime, where the same instructions are mixed with thirty-seven others, all stated days ago in token-distance terms, and the model is implicitly weighing them all against each other on every turn.
An unseen capability is an unusable one
There’s a safety angle too, and it’s the under-appreciated one. The model can’t accidentally use a capability it can’t see. In a non-progressive-disclosure setup, if you don’t want the model to ever delete files, you have to write a long instruction about not deleting files, hope the model reads it, and accept that the instruction can be overridden by a sufficiently determined prompt injection. In a progressive-disclosure setup, you simply don’t put the deletion capability in the discoverable surface for tasks where it shouldn’t be available. The model isn’t refraining from the action because it was told not to. The action is not in the model’s affordance set at all. This is a categorically stronger guarantee.
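One way to picture the difference is a harness that builds the tool surface per task, so a disallowed tool is simply absent and not merely forbidden. The sketch below is an assumption-laden toy: the task names, the `TASK_SURFACES` mapping, and the stub tools (which do no real file I/O) are all made up for illustration.

```python
def read_file(path: str) -> str:
    return f"(contents of {path})"  # stub: no real I/O


def delete_file(path: str) -> str:
    return f"(deleted {path})"  # stub: no real I/O


ALL_TOOLS = {"read_file": read_file, "delete_file": delete_file}

# Hypothetical mapping from task type to the tools that task may discover.
TASK_SURFACES = {
    "summarize_docs": {"read_file"},
    "cleanup_workspace": {"read_file", "delete_file"},
}


def tools_for(task: str) -> dict:
    """Build the discoverable surface for one task."""
    allowed = TASK_SURFACES[task]
    return {name: fn for name, fn in ALL_TOOLS.items() if name in allowed}


def dispatch(task: str, tool_name: str, path: str) -> str:
    """Run a tool call; anything outside the task's surface does not exist."""
    surface = tools_for(task)
    if tool_name not in surface:
        raise LookupError(f"unknown tool for {task!r}: {tool_name}")
    return surface[tool_name](path)
```

In this toy, a `summarize_docs` agent that tries to call `delete_file` gets the same error it would get for a tool name that was never defined. The check lives in the harness, not in an instruction the model has to remember.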
Progressive disclosure also produces better trajectories for review. When you read an agent’s log, you can see exactly which skills it loaded, in what order, in response to what. The trajectory becomes self-documenting: the agent’s choices about what to fetch are visible reasoning steps. Compare to a static system prompt, where everything was available everywhere and the agent’s actions look unmotivated because the rationale is buried in eight thousand tokens of always-on context.
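To make the "self-documenting trajectory" idea concrete, here is a small sketch of what recording skill loads might look like. The `LoadEvent` and `Trajectory` types, the skill names, and the trigger strings are hypothetical; a real harness would likely record far more than this.

```python
from dataclasses import dataclass, field


@dataclass
class LoadEvent:
    step: int
    skill: str
    trigger: str  # what the agent was responding to when it fetched


@dataclass
class Trajectory:
    events: list[LoadEvent] = field(default_factory=list)

    def record_load(self, skill: str, trigger: str) -> None:
        """Append one fetch, numbering steps from 1 in load order."""
        self.events.append(LoadEvent(len(self.events) + 1, skill, trigger))

    def render(self) -> str:
        """One line per fetch: which skill, and in response to what."""
        return "\n".join(
            f"{e.step}. loaded {e.skill} <- {e.trigger}" for e in self.events
        )


log = Trajectory()
log.record_load("sql-migrate", "user asked to add a column")
log.record_load("rollback-plan", "migration touches a production table")
print(log.render())
```

Each line of the rendered log answers the reviewer's first two questions (what did the agent reach for, and why) without anyone having to reconstruct them from a long always-on prompt.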
The library grows, the context doesn’t
If you’re designing a harness today and you don’t have progressive disclosure, you have a future scaling problem. The number of skills, tools, and contextual hints you want available to your agents will grow. The context you can usefully fill will grow more slowly than your capability library, even if context windows keep expanding, because the marginal value of additional content drops off long before the hard limit. Progressive disclosure is what lets the library grow without the context growing with it.
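A back-of-the-envelope calculation shows the shape of the problem. The sketch below compares the always-on cost of inlining every skill against the cost of a search that returns a fixed number of summaries plus a couple of fetched bodies. All token figures are made-up round numbers, and the function names are invented for this illustration.

```python
def inlined_tokens(n_skills: int, body_tokens: int) -> int:
    """Always-on cost when every skill body lives in the prompt."""
    return n_skills * body_tokens


def disclosed_tokens(listed: int, summary_tokens: int,
                     loaded: int, body_tokens: int) -> int:
    """Cost when a search returns `listed` summaries and `loaded` bodies are fetched."""
    return listed * summary_tokens + loaded * body_tokens


# All figures below are made-up round numbers for illustration.
for n_skills in (10, 100, 1000):
    print(
        n_skills,
        inlined_tokens(n_skills, body_tokens=800),
        disclosed_tokens(listed=5, summary_tokens=20, loaded=2, body_tokens=800),
    )
```

Under these assumptions the inlined cost grows linearly with the library, while the disclosed cost does not depend on library size at all. If the listing tool returned every summary instead of a fixed number of search hits, that term would grow with the library too, but at the per-summary rate instead of the per-body rate.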
I think it’s likely that progressive disclosure becomes one of those concepts that, in five years, everyone treats as obvious and forgets had to be discovered. It is like REST in web API design, or pure functions in functional programming: once you have the concept, the systems built without it look broken in retrospect. The prompt-era practice of dumping every capability into a single string is going to age the same way. It already looks strange to read about, and it’s only been a couple of years since it was state of the art.
The cleanest summary I can give is: skills are the unit, progressive disclosure is the access pattern, and together they replace the system prompt with a library. Once you have a library you start designing the rest of the system differently. The next post is about one of the places that shift takes you most explicitly — toward agents that look, structurally, more like operating systems than like chatbots.