Essay · Cognition & AI · 14 min read

Reach

How breadth of mind decides which work survives the AI era, from the brain's dormant reserve to the top rung of every ladder.

A weird thing about brains

You have roughly 86 billion neurons. Right now, thinking about whatever this sentence just made you think about, only a small fraction of them, call it something like 2 to 10 percent depending on how you count and what task you're doing, is active in the pattern carrying that thought. The rest of the system is quiet. Dark. In reserve.

This isn't inefficiency. If your brain lit up everything at once, you wouldn't think harder. You'd have a seizure.

That dormant reserve is what makes the active fraction meaningful. Cognition, it turns out, isn't about how much of your brain is on. It's about which small fraction fires, selected from a vast reserve of possibilities, most of which never activate for any given thought.

The dark reserve is where the answer could have come from. That optionality, the sheer amount of stuff you're not thinking about but could be, is most of what intelligence actually is.

Hold that picture. We're going to leave biology now and find the same shape in a completely different place.

Visual 1
A thought, visualized
Two views of the same phenomenon. On the left, a library of specific lived experience. On the right, a brain in cross-section. Pick a thought or let it cycle. Watch a small fraction activate in both views simultaneously. A tiny subset fires. The rest stays silent but present.
Panel titles: a library of lived experience · a brain, in cross-section.

The same thing inside a modern AI model

Here's the thing that sounds like a coincidence and isn't: many of the frontier AI models doing the most impressive work right now use a structurally similar trick.

The old architecture (dense models) lights up every parameter for every query. Small, fast, always fully on. Fine for narrow tasks. The new architecture ("mixture of experts") works more like cortex: a vast parameter space, most of it dormant at any given moment. A routing layer decides which small fraction to activate for a given prompt. The rest stays dark.
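
To make the shape concrete, here is a minimal sketch of top-k expert routing, in the spirit of mixture-of-experts layers generally rather than any particular model's implementation; the sizes, names, and toy router below are illustrative assumptions, not real values.

```python
import numpy as np

# Illustrative sizes, not any real model's. The point is the shape:
# a large reserve of experts, a router, and only a few experts firing per token.
N_EXPERTS = 64   # the reserve the layer carries around
TOP_K = 2        # how many experts actually activate for one token
D_MODEL = 512    # width of the token representation

rng = np.random.default_rng(0)
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02 for _ in range(N_EXPERTS)]
router = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02   # the routing layer


def moe_layer(token: np.ndarray) -> np.ndarray:
    """Run one token through only TOP_K of the N_EXPERTS; the rest stay dark."""
    scores = token @ router                      # score every expert for this token
    chosen = np.argsort(scores)[-TOP_K:]         # pick the few that fire
    gate = np.exp(scores[chosen])
    gate /= gate.sum()                           # softmax over the chosen few

    out = np.zeros_like(token)
    for g, idx in zip(gate, chosen):
        out += g * (token @ experts[idx])        # only these weights do any work
    return out


y = moe_layer(rng.standard_normal(D_MODEL))
print(f"experts that fired: {TOP_K} of {N_EXPERTS}")   # the other 62 stayed in reserve
```

A dense layer is the degenerate case of the same sketch: every expert runs for every token, so there is no reserve, only cost.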

You don't need to care about the mechanics. You need to notice the shape. Large-scale selective activation out of a huge latent reserve is a move your brain also makes, and it is increasingly how the most capable models are built. Whatever you believed about AI capability in 2022, when models were mostly dense and mostly small, you should update. The systems doing the most interesting cross-domain reasoning today look less like monolithic calculators and more like selective retrieval over a large reserve.

Which raises the obvious question: what does that dormant reserve actually buy you? A small model and a large sparse model can both answer a narrow question about manufacturing. What does the larger one have that the smaller one doesn't?

Visual 2
Two brains, same question
A smaller sparse model and a larger one, both given the same manufacturing problem. Watch what each one has inside to reach for, and what it doesn't.
Panels: the query · a smaller sparse MoE, roughly 200B (Sonnet), where sports biomechanics is marked "not in this model" · a larger sparse MoE, roughly 1T (Opus).
Parameter counts are illustrative, not exact. Anthropic has not disclosed the actual sizes of Sonnet or Opus. The point is the relative size and structure, not the specific numbers.

The golf swing in the factory

Here's a scene that makes it concrete. Illustrative, but the pattern is real.

A manufacturing engineer is debugging a process where a rotating component produces inconsistent output. The problem is subtle. Not a mechanical failure. A timing issue in how force gets distributed across the rotation.

A senior engineer walks in. Watches for thirty seconds. Says: "It's the same thing as coming over the top in a golf swing. Your force is arriving early in the rotation, not at the bottom. Delay the power delivery."

Visual 3
The same timing error in two substrates
A factory rotor and a golf swing, both rotating through the same phase. In the broken state, peak force arrives early, before the bottom of the rotation. In the fixed state, force arrives at the bottom. Toggle between them and watch both systems stay in sync.
Panel labels: factory rotor · golf swing · rotation phase · current angle · force peak.

They fix it in an hour.

Look at what just happened. Two people in the room. The manufacturing engineer knew manufacturing better than anyone alive, every quirk of this particular process, every piece of equipment, every vendor's tolerances. They'd been stuck on the problem for a day. The senior engineer knew less about manufacturing, and solved it in thirty seconds. Same factory, same problem, two very different outcomes. What's the difference?

The senior engineer played golf on weekends. That's it. Somewhere in their dormant repertoire was a specific concept, coming over the top, force arriving early in the swing, that happened to rhyme with the factory's timing problem. The manufacturing engineer didn't have that concept available. They couldn't reach for it because it wasn't in their repertoire at all.

This is exactly the contrast between a small MoE model and a large one. A smaller model can know manufacturing perfectly well; its manufacturing experts activate, its answer to "fix this rotor" is sharp. What it doesn't have is the dormant reserve of sports-biomechanics weights that a larger sparse model carries around. The golf-swing pattern was either never learned or got compressed out during training: not load-bearing for any common task, so it got pruned. The smaller model optimized for the average case, and in doing so, tended to remove exactly the weird latent knowledge that makes cross-domain reach possible. The larger model kept those weights around. Most of the time they sit dark. Once in a while, a problem arrives where those exact weights are the ones that need to fire, and the larger model reaches. The smaller one can't.
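
If you want the retrieval mechanic in miniature, here is a toy sketch, not how any real model stores knowledge; every concept name and feature tag below is invented for illustration. The structural point it demonstrates is that a store which never kept the distant concept cannot return it, no matter how well the query is phrased.

```python
# A toy, not a model: concepts as feature-tag sets, retrieval as overlap with
# the query's tags. All names and tags are invented for illustration.
QUERY = {"rotation", "force", "timing", "peak-arrives-early"}

small_repertoire = {
    "bearing wear":   {"rotation", "friction", "vibration"},
    "belt slippage":  {"rotation", "torque", "slip"},
    "controller lag": {"timing", "latency", "signal"},
}

large_repertoire = dict(small_repertoire)
large_repertoire.update({
    # Dormant almost always, kept around anyway.
    "golf: coming over the top":  {"rotation", "force", "timing", "peak-arrives-early"},
    "drumming: rushing the beat": {"timing", "rhythm", "peak-arrives-early"},
})


def best_match(query: set, repertoire: dict) -> tuple:
    """Return the concept whose tags overlap the query most (Jaccard overlap)."""
    def overlap(tags):
        return len(query & tags) / len(query | tags)
    name = max(repertoire, key=lambda k: overlap(repertoire[k]))
    return name, round(overlap(repertoire[name]), 2)


print(best_match(QUERY, small_repertoire))   # a weak, in-domain match at best
print(best_match(QUERY, large_repertoire))   # the distant golf concept wins outright
```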

This is where the distinction between tasks and jobs becomes load-bearing.

Task vs. job:
What it is · Task: a problem inside a defined domain. Job: an open situation where you don't know which domain holds the answer.
Example · Task: "Debug this manufacturing process using manufacturing knowledge." Job: "Figure out why the output is inconsistent."
What it needs · Task: local retrieval, knowledge inside the domain. Job: reach, the capacity to grab a pattern from far away in concept-space.
Who does it well · Task: small models, specialists. Job: large sparse models, generalists with deep repertoires.

Jobs can't be done by systems with narrow repertoires, no matter how well post-trained. You can't train your way to knowledge that isn't there. You need a system with enough dormant material that the right distant connection is available to fire when the situation asks.

This is why large sparse models feel qualitatively different, not just incrementally better. Not because they know more; both big and small models know plenty. Because they kept more useless things around. And creativity, stripped to its mechanism, is the licensed retrieval of useless things at the right moment.

The visual below makes the mechanism concrete. Same question, a broken manufacturing rotation, handed to three models at different sizes. At 50 billion parameters, "sports motion" exists as a vague shape, and nothing specific reaches out from it. At 200 billion, manufacturing starts resolving into specifics, but sports is still a blur. At 1 trillion, something new happens: sports resolves into distinct concepts, and two separate paths converge on "golf swing." One from the query itself, one from the "timing" concept inside manufacturing that rhymes with the golf swing's force-arriving-early problem.

The triangle at 1T is the picture of cross-domain reach. Not one line but two, arriving from different origins at the same specific distant memory. The small model couldn't have drawn either of them.

One caveat worth naming. This visual is about reach specifically, not general capability. Larger models are also better at reasoning, calibration, and instruction-following. Those are real gains, living on different axes. The claim this essay is making is narrower: that cross-domain retrieval depends on having the specific distant concept available to retrieve, and bigger sparse models have more specific distant concepts available. That's it. Everything else large models do well is additional, not what this is about.

Visual 4
The same question, three model sizes
The query sits in the center. Manufacturing, its home domain, sits right beside it. Sports lives far away, at first as a blur. As the model grows, sports resolves into specific concepts, and two reach-lines converge on "golf swing." That convergence is what size actually buys.
Controls and labels: model size · music · cooking · query (broken manufacturing).

Pre-training, post-training, and the harness

Three terms are going to show up the rest of the way. Rather than define them abstractly, think about a chef.

A chef has three things running at the same time. A lifetime of food absorbed: every meal they've eaten, every grandmother's stew, every back-alley restaurant on some trip a decade ago. A professional tradition they've trained in: French brigade, Japanese kaiseki, Italian trattoria, with all its conventions about what "good" looks like and how a service works. And a specific kitchen they're standing in tonight: this restaurant, this menu, tonight's 40 reservations, this team, this equipment, this dinner rush.

In modern ML, those three layers are called pre-training, post-training, and the harness. They're running inside every AI system simultaneously. And here's the move that makes this section worth reading: they're running inside you too.

Here's the claim:

Pre-training builds the repertoire. Post-training shapes how it's deployed. The harness defines the job. The repertoire is the expensive part. Everything else is cheap relative to building it.

You can't post-train your way out of a small repertoire. You can't harness your way out of it either. The harness assumes the repertoire is there to draw on. When it isn't, the harness just surfaces the gap faster.

Take a concrete example: a PE VP who becomes a great agent orchestrator. On the surface it looks like a pivot from finance to AI, entirely different jobs. Underneath it's a harness swap. The repertoire was already there: years of pattern recognition under uncertainty, decomposing messy real situations, sizing questions before asking them, knowing when an answer is good enough. The new harness gives different tools and a different success criterion. The cognitive engine is the same one.

The swap is cheap because the repertoire is broad enough to reach. Someone who only ever did PE diligence, without the broader pre-training, couldn't make the swap. The harness would sit empty. And in the other direction: a PE VP cannot become a literary translator the same way. The linguistic intuitions were never absorbed, and no harness conjures them into existence.

Harness swaps are cheap when repertoire is broad, expensive when narrow, and impossible when the relevant material simply isn't there. Same for models. Same for humans.
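
As a structural sketch only, with every class, field, and string below invented for illustration: the harness is cheap configuration wrapped around a fixed repertoire, and swapping it changes the job without adding anything to what the system can reach for.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Model:
    """The expensive part: what was absorbed (pre-training) and the default
    behavior shaped on top of it (post-training). Changing either means retraining."""
    repertoire: frozenset
    defaults: str


@dataclass
class Harness:
    """The cheap part: the context the system runs in right now."""
    system_prompt: str
    tools: list = field(default_factory=list)
    context: str = ""


def run(model: Model, harness: Harness, situation: str) -> str:
    # The harness frames the job; the answer still has to come out of the repertoire.
    reachable = [c for c in model.repertoire if c in situation]
    if not reachable:
        return "the harness is fine; the repertoire has nothing to reach for"
    return f"drawing on: {', '.join(sorted(reachable))}"


vp = Model(
    repertoire=frozenset({"sizing questions", "decomposing messy situations",
                          "pattern recognition under uncertainty"}),
    defaults="frame before answering",
)

diligence = Harness("run IC diligence", tools=["spreadsheets"], context="Q3 pipeline")
orchestration = Harness("orchestrate agents", tools=["agent swarm"], context="research sprint")

# Swapping the harness is a config change; the model, the expensive part, is untouched.
for h in (diligence, orchestration):
    print(h.system_prompt, "->", run(vp, h, "decomposing messy situations in a new domain"))
```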

Worth naming before we go further. I've painted brain architecture and AI architecture in broad strokes, and actual neuroscientists and ML researchers would add texture to almost every claim above. Sparse activation in brains is more complicated than any simple "2 to 10% fire" framing. Experts in MoE models don't specialize by domain the way my visuals suggest. "Pre-training" and "post-training" are reasonable handles, not precise equivalences across humans and models. I'm reaching for the shape of an idea, not a textbook description of either system. The argument about repertoire, reach, and what the next decade of work rewards doesn't depend on the mechanical details being exact. If the shape rings true, that's what matters.

Visual 5
Three layers, three substrates, same shape
Pre-training, post-training, and the harness run inside a chef, inside a human professional, and inside an AI model. Read across each row to see the same layer in three different places. Then try swapping the professional's harness and watch what changes, and what doesn't.
Layer 1 · Pre-training: a lifetime absorbed. Vast, slow, dormant until summoned.
The chef: every meal, every trip, every grandmother. Mole, miso, pâté, grandma's stew, that Oaxacan spot, ramen at 2am, dim sum, masa, baguette, plus thousands more.
The professional: everything absorbed growing up. Grandfather's job stories, high-school chem, summer chess tournaments, a bad relationship, two languages, college calculus, backpacking through Italy, dad's woodshop, plus thousands more.
The AI model: unsupervised training on human output. Wikipedia, GitHub, arXiv, Project Gutenberg, news archives, StackOverflow, recipe blogs, forum threads, plus trillions of tokens.

Layer 2 · Post-training: professional defaults. Shapes how the repertoire gets deployed.
The chef: conventions of the trade. French brigade (station hierarchy, mise en place), Japanese kaiseki (seasonal discipline), Italian trattoria (family service, simple plates).
The professional: first years on the job. Pattern recognition under uncertainty, decomposing messy situations, sizing questions before asking.
The AI model: fine-tuning on preferred behavior. Instruction tuning (follow what's asked), RLHF (preferences learned from feedback), safety tuning (when to decline).

Layer 3 · The harness: tonight's context. Specific, swappable, cheap to change.
The chef: this kitchen, this service. Thursday · 7pm · Brooklyn bistro, 40 covers · 4 line cooks · tonight's menu.
The professional: your current job. PE VP at a mid-market fund, Q3 deal pipeline · IC reviews · 4 analysts.
The AI model: this session, this task. "Debug this code", terminal · 200k tokens · no web search.
Three layers, three substrates, same shape. The pre-training layer is vast, slow, and expensive. Post-training shapes how it gets deployed. The harness is whatever context you're in right now. A chef moves kitchens, a professional changes jobs, an AI gets a new system prompt, but no amount of harness change fills a repertoire that isn't there. The inverse also holds: a deep repertoire fills a new harness beautifully, which is why the PE VP can become an agent orchestrator.

The two ladders meet in the middle

The previous section treated a harness as the runtime context of a single job: this kitchen tonight, this role this quarter. Now step back. A career is a sequence of jobs stacked over years, a long container that shapes the jobs inside it. And careers, like jobs, have internal structure worth looking at.

There are two ways people end up running things.

One path starts in code: writing, reviewing, scoping, architecting. The other path starts in financial models: building, reading, structuring, deciding. Two totally different worlds, with different credentials, day-to-day texture, networks, wardrobes. And yet the top of each ladder does the same kind of work.

Visual 6
Two careers, same work at the top
The engineering career and the finance/PE career are two long, different paths. Each shapes a sequence of jobs over years. But at the top of each, the same kind of work lives: systems-thinking work. The brass dashed container marks where that work starts in each ladder.
Career A · The engineering career. Systems-thinking work starts in code and climbs upward.
Principal / Architect: deciding what questions the team should ask.
Staff engineer: scoping projects, architecture reviews.
Senior engineer: reviewing others' code, owning a domain.
IC coder: writing code, fixing bugs, shipping features.
Junior / new grad: learning the system, assigned tickets.

Career B · The finance / PE career. Systems-thinking work starts in models and climbs upward.
MD / Partner: deciding what the deal actually is.
Principal: running processes, sourcing, structuring.
VP: reading others' models, framing questions.
Associate: building memos, diligence workstreams.
Analyst: building models, line-by-line diligence.
Two different careers, same work at the top. Each career ladder is its own long sequence of jobs. Inside each, the top portion (marked by the dashed container) is where systems-thinking work lives. Below it is execution. The top portion looks almost identical across these two wildly different worlds.

This isn't specific to tech and finance. The same shape shows up in consulting, product, operations, and creative work. Every serious career has a zone near the top where the work shifts from executing inside systems to specifying them. Visual 7 extends the picture across four domains and adds the question that matters most for the next decade: what happens when AI is dropped on top of all of them at once.

But first, one more distinction worth naming. The top three rungs of a ladder aren't doing the same work. The top two are doing systems-thinking: specifying what gets built, framing the question. The next rung or two down are doing drive-to-execution: reviewing the work, catching errors, directing the people below them, verifying that things actually ship. The bottom is doing tasks: the line-by-line production. Three distinct cognitive zones. And AI interacts with each one differently.

Visual 7
AI eats the tasks at the bottom of every ladder
Four careers. Three zones in each: systems-thinking at the top, drive-to-execution in the middle, tasks at the bottom. Toggle between before and under AI pressure to see how the three zones change, and what happens to a systems-thinker who let the drive-to-execution muscle atrophy.
Part A · The three zones across four careers
Part B · Two muscles, one zone
The Goldilocks zone: where both muscles are alive at the same time. Systems-thinking knows which way to row; drive to execution knows how to row. The leader too far from the work sees the direction but loses touch with ground truth. The associate knows the stroke but doesn't know which way to go. Both muscles alive: sees the direction and feels the stroke, durable under AI. Shortcuts are only trustworthy when someone knows the long way.
The job lives in the overlap. AI can row now, and it can help audit the stroke. But the shortcut is only trustworthy if someone in the loop knows the long way well enough to tell when it preserved what mattered and when it only produced something plausible. That verification work, where systems-thinking meets drive-to-execution, is what doesn't get replaced. Stay in the overlap.

Think of a boat. An associate knows how to row but doesn't know which direction to go. Perfect stroke, wrong ocean. A senior leader who has drifted too far from the work still sees the direction, but can lose the feel for whether the boat is moving cleanly or just looking busy.

Neither alone is what the job needs. And AI makes this sharper, not gentler: AI can row now, and it can do a decent job of pointing roughly north. It can even help audit the stroke. But the shortcut is only trustworthy if someone in the loop knows the long way well enough to tell when the motion looks right while the boat is still off course. That verification work lives in the overlap, where knowing-the-direction and feeling-the-stroke are both alive in the same person.

This is also why taste and judgment sit at the top of every ladder. A senior person's real edge isn't just seeing the direction, it's knowing whether a particular deal is any good, whether a particular deck is any good, whether a particular architecture is any good. That judgment was built through years of reviewing models and running diligence, reading a thousand bad decks before writing a great one. Taste doesn't come from the systems-thinking muscle alone. It comes from staying close enough to execution to keep your judgment calibrated.

The jobs at the top of every ladder demand that overlap, and the taste and judgment that grow inside it. Staff engineers do systems-thinking work and still read pull requests. PE principals frame the deal and still check the working capital math. The best founders decide what to build and still talk to customers themselves. The consulting partners worth their fees still review the deck before it goes to the client. So do the rare product managers and the generals who actually win. The cognitive shape is nearly identical: both muscles alive at once, and the taste and judgment that only come from keeping them warm over time.

Why this work just got ten times more valuable

Everything that lives purely in the task zone is becoming automatable to some degree. Analyst line-by-line work, IC engineering tickets, consulting slide production, junior PM launches. Tasks, by definition the bounded operations that live inside a defined system, are exactly what AI is best at.

What doesn't compress is the work of defining the system and the work of driving it through to real execution. The first requires reach: holding many things in mind at once and noticing which rhyme, a broad pre-training, years of absorbing how different kinds of systems break. The second requires having done the work yourself, knowing what "good" actually feels like, being able to smell when something is off even when the surface looks clean.

Systems-thinking plus drive-to-execution is what demands a large repertoire. A narrow model, or a narrow person, can operate inside a specified system beautifully, but can't specify the system. Can review the surface of someone else's work, but can't verify the real quality. The specification work is the part that needs the golf swing and the manufacturing floor and the org dynamics and the pricing intuition all available in the same head at the same time, ready to fire when the situation asks. The execution-driving work is what makes sure the right thing actually happens.

A warning that's easy to miss

This does not mean "stop doing the work and become a manager." And it also doesn't mean "get promoted and drift too far from ground truth."

The person who thrives at the top isn't someone who skipped the ladder, and it isn't someone who climbed the ladder and then checked out from the work below them. It's someone who climbed far enough that the systems became visible from where they were standing, and kept their hands close enough to the work that they can still feel when something is wrong. The engineer who learned to see systems by building them, and still reviews code. The VP who learned to see operations by running them, and still walks the floor. The top rung earned its right to be occupied by the repertoire that came before, and stays occupied by keeping the drive-to-execution muscle warm.

Two failure modes. The first: an under-calibrated manager is someone put into systems-thinking work without the repertoire to fill it. They give directions without understanding what they're directing. They ask for outcomes without seeing the joints. In an AI-heavy world, they become exposed faster than the people who are still close enough to the work to notice what's wrong.

The second, more subtle: a senior person who had the repertoire but let the drive-to-execution muscle atrophy. Ten years of outsourcing diligence to juniors, ten years of only showing up for the strategic question. They can still sound like a systems-thinker on paper. But in an AI-augmented world, judgment without calibration gets more exposed, not less. AI can now do more of what their teams used to do, which means the senior person's edge depends even more on staying close enough to the work to verify what matters. When that contact fades, taste dulls, systems-thinking floats upward, and confidence can outrun calibration.

So aim for the zone

Here's the move I'd argue for, wherever you are on whatever ladder. Orient toward the systems-thinking zone. Not by skipping the work below; that's the first failure mode above. By doing the work below with one eye on how the system you're inside actually works. What would break if you changed it. What questions never get asked. Where the joints are.

The analyst who climbs fastest is the one who, while building the model, is already tracking which questions the model is silently assuming the answer to. The junior engineer who climbs fastest is the one who, while shipping the feature, is noticing the architectural decision buried inside it. The consultant who climbs fastest is the one who, while building the deck, is seeing the pattern their manager is using to frame the whole engagement.

That habit, doing the work while seeing the system it's inside, is how the repertoire turns into systems-thinking capacity. Everyone should be training toward it, even if the move takes a decade. The durable work in the next decade is up there. Earn the right to occupy it.

And the reward for doing this is growing, not shrinking.

The same shift that compresses the old task list is opening up entirely new businesses, entirely new kinds of analysis, entirely new roles that didn't exist before because they were uneconomic to build.

Those are the places where systems-thinkers, at any level of seniority, get to build things the previous generation simply couldn't. A junior analyst orchestrating a swarm of AI agents to run analyses that would have required a whole consulting engagement in 2020. A senior engineer specifying a product that one person now builds end-to-end. A junior consultant running diligence workstreams that used to take a team of five. The zone isn't disappearing; it's proliferating.

Back to the quiet neurons

Most of your neurons are quiet right now. A few percent, call it something like 2 to 10 percent depending on how you count, are doing the work of this moment. The rest will probably stay quiet for the rest of this sentence. And yet sometime in the next hour some tiny fraction of them will fire in response to something you haven't thought about in years, and it will rhyme with whatever you're working on, and you'll call that a good idea.

That's the whole game. In brains. In models. In careers.

The value lives in what you absorbed when it didn't seem useful, and in the reach that lets you find it again. Pure task-doers get squeezed first. The systems-thinkers, doing the work of specifying rather than executing, occupy positions that only the full repertoire can fill.

Go build a large one while you still can.


Part of a series

Three essays, three scales

One argument, expressed at three altitudes: The Principal and the Swarm at 20 feet, Reach at 2,000 feet, and The New Stable Orbits at 20,000 feet.

Read them together if you want the full arc: what gets built, the patterns underneath it, and the structural shifts that follow.

The underlying ideas here borrow from sparse activation in cortical networks, mixture-of-experts architectures, and the pre-train/post-train distinction in modern ML. The synthesis, that scope of work is bottlenecked by repertoire size, and that the same broad mechanism rhymes across brains, models, and careers, is the individual-scale piece of a longer argument about what the next decade of work rewards.