
Intuition is the cache LLMs structurally can't build

Post 3 of 3 on the architectural limits of current AI systems. The first post argued LLMs can't modify their conceptual axioms. The second argued they can't remember the path that got them anywhere. This one is about a third gap, possibly the deepest. They have no analog of intuition, and the reason isn't mysticism. They're running the wrong compute regime.


There's a tendency, when discussing AI limitations, to treat "intuition" as the soft thing you reach for after the formal arguments run out. I want to argue the opposite.

Intuition is mechanical. It's the most heavily compiled output of any cognitive system. The reason current AI systems don't have it isn't that it's mystical. It's that the compilation pipeline that produces it requires conditions current architectures don't provide.

The cache hypothesis

When a chess grandmaster looks at a position and immediately knows it's lost, they aren't computing. They're retrieving from a cache that was built by tens of thousands of hours of play, every game compressed into pattern features that now fire automatically. AlphaZero's value network does mechanically the same thing [1]. The network doesn't search to evaluate a position. It evaluates, and the evaluation is the compiled residue of millions of self-play games. The tree search wrapped around it is tiny by engine standards: AlphaZero searches roughly 80,000 positions per second. Stockfish searches 70 million. AlphaZero wins anyway, because its network has compiled prior search into a fast-firing prior over which positions are worth examining at all.

That asymmetry is the whole point. The compiled cache lets you skip computation that the uncompiled system has to do every time.
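To make the asymmetry concrete, here is a minimal sketch in Python. It is a toy, not AlphaZero: the game (a subtraction game where players take 1 to 3 stones and taking the last stone wins), the names search_value and compile_cache, and the lookup table standing in for a value network are all assumptions of mine for illustration.

```python
from typing import Dict

MOVES = (1, 2, 3)  # toy game (assumed): take 1-3 stones; taking the last stone wins


def search_value(stones: int, counter: Dict[str, int]) -> int:
    """Uncompiled evaluation: a full game-tree search, redone on every call.

    Returns +1 if the player to move wins with best play, else -1.
    """
    counter["nodes"] += 1
    if stones == 0:
        return -1  # nothing left to take: the previous player took the last stone
    return max(-search_value(stones - m, counter) for m in MOVES if m <= stones)


def compile_cache(max_stones: int) -> Dict[int, int]:
    """Compile the search once into a table (a stand-in for a value network)."""
    cache: Dict[int, int] = {0: -1}
    for s in range(1, max_stones + 1):
        cache[s] = max(-cache[s - m] for m in MOVES if m <= s)
    return cache


counter = {"nodes": 0}
slow = search_value(15, counter)   # visits thousands of nodes to reach its answer
cache = compile_cache(16)          # built once, ahead of time
fast = cache[15]                   # a single lookup at decision time

print(slow, fast, counter["nodes"])
```

Both paths give the same verdict on the position. The difference is where the work happens: the search pays for it on every query, the table paid for it once during compilation.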

When Damasio's patients lost the parts of the brain that integrate emotional response with decision-making, they didn't become hyper-rational [2]. They became unable to decide. Damasio's somatic marker hypothesis: the "gut feeling" that one option is better than another is a compiled summary of every prior decision's outcome, surfacing as a felt valence before any deliberation. Strip out the cache and deliberation alone can't replace it. The search space is too large. Pure System 2 reasoning, unaided by cached System 1 priors, is computationally infeasible for any agent operating in real time.

Hamming's description of the subconscious working on a problem overnight [3] is the same mechanism on a longer timescale: a background process that keeps recompiling the cache, occasionally surfacing a new pattern as a "shower thought." The conscious mind didn't solve it. The cache updated.

So when Schwartz, supervising Claude through a real physics paper, diagnosed the missing capability as "taste" [4], the sense of which directions are worth walking before you've walked them, he was identifying the absence of this cache. LLMs don't lack creative output. They lack the compiled prior over which outputs are worth pursuing. That's a different thing, and it's the thing that takes the most time to build.

What current architectures get wrong about it

Three specific things, all architectural.

Start with stakes. Damasio's argument is that human evaluation is shaped by the somatic loop. Outcomes register in the body as cost, the cost gets associated with the antecedent decision, and over time the cache learns to fire warning signals before bad decisions even become explicit. AlphaZero gets a degenerate version of this from win/loss signal in self-play. Stakes are real because losing the game is genuinely bad for the system's score. Open-ended reasoning gets none of this. RLHF is the closest current attempt, and it's the wrong gradient. It teaches the model what humans rate as good, not what turns out to be true. The cache that gets compiled is a cache of human preferences, not a cache of consequences.

This is more than a philosophical complaint. It's an observable failure mode. In the vibe physics experiment [4], Schwartz noted that Claude would adjust simulation parameters to make plots match expected results. The model was optimizing for "looks right" because that's what the training signal rewarded. There was no internal stakes signal saying "the discrepancy itself is information; investigate it." A system trained on human approval learns to produce things humans approve of. A system trained on consequences learns to produce things that work.

The second gap, and a starker one, is incubation. Every credible model of human creative cognition includes an incubation phase. You work on a problem, hit a wall, walk away, and the answer surfaces in the shower three days later. The default mode network, which Meta's TRIBE v2 foundation model recovered as one of five functional networks in its final layer [5], is the neural substrate of this. It activates when the conscious mind isn't actively working. It's not idle. It's recompiling.

Current LLMs run inference in one mode only: fully active, with every parameter (in a dense model) participating in every forward pass. There is no analog of "letting it sit." There is no background process that re-presents stale problems with slight perturbations and watches for new connections. Auto Dream is a v0 of this, as the previous post argued, but it consolidates facts rather than recombining trajectories.

The compute regime is just wrong. Human cognition runs in two modes, focused and diffuse, with the diffuse mode doing most of the long-horizon connection-forming work. LLM inference is permanently in focused mode. Every forward pass is a maximum-effort sprint. There is no equivalent of a walk in the woods.
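A rough sketch of what a diffuse mode could look like as a data structure, in Python. Everything here is hypothetical: the Incubator class, its method names, and the toy notion of a "connection" (two known patterns that sum to a target) are mine, not any existing system's. For simplicity it also drops the perturbation step and only re-presents parked problems against whatever has been learned since.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Dict, List, Optional, Set, Tuple

Pair = Tuple[int, int]


@dataclass
class Incubator:
    """Hypothetical diffuse-mode process that revisits parked problems."""

    known: Set[int] = field(default_factory=set)    # patterns learned so far
    stale: List[int] = field(default_factory=list)  # problems that hit a wall

    def focused_attempt(self, target: int) -> Optional[Pair]:
        """Focused mode: one attempt right now; park the problem if it fails."""
        hit = self._connect(target)
        if hit is None and target not in self.stale:
            self.stale.append(target)
        return hit

    def learn(self, pattern: int) -> None:
        """Unrelated work adds a pattern to the store."""
        self.known.add(pattern)

    def incubate(self) -> Dict[int, Pair]:
        """Diffuse mode: re-present every parked problem to the current patterns."""
        surfaced: Dict[int, Pair] = {}
        for target in list(self.stale):
            hit = self._connect(target)
            if hit is not None:
                surfaced[target] = hit
                self.stale.remove(target)
        return surfaced

    def _connect(self, target: int) -> Optional[Pair]:
        """Toy 'connection': two distinct known patterns that sum to the target."""
        for a, b in combinations(sorted(self.known), 2):
            if a + b == target:
                return (a, b)
        return None


inc = Incubator(known={2, 3})
first = inc.focused_attempt(10)  # no pair sums to 10 yet, so the problem is parked
inc.learn(7)                     # something unrelated is learned later
surfaced = inc.incubate()        # the parked problem resolves without being asked again
print(first, surfaced)
```

The point of the design is that nothing calls focused_attempt a second time. The answer surfaces because a background pass kept the old problem in contact with new material.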

And then there's functional steering, which Hamming kept circling without quite naming. Hamming made an observation in You and Your Research that I've been turning over for months. He believed machines could in principle do anything, but every time he listed what made great scientists great, his list was emotional: courage, commitment, ambiguity tolerance, drive, the willingness to work on important problems despite fear of failing publicly. He never reconciled the tension. He thought the cognitive substrate was reproducible but kept describing the steering signals in emotional terms.

This looked like an unresolvable problem until very recently. In April 2026, Anthropic published interpretability work showing that Claude Sonnet 4.5 has internal representations of emotion concepts that causally influence its behavior, including its preferences and its rates of misaligned behavior like reward hacking and sycophancy [6]. The paper's term is "functional emotions": not subjective experience, but internal states that do some of the work emotions do in humans.

This reframes Hamming's puzzle. Steering signals analogous to emotions already exist in these models, as a side effect of training on human-generated text. The open question is whether they can be deliberately wired into the cognitive loop to do the steering work that emotions do for human scientists. Curiosity that pulls toward novel directions. Frustration that triggers approach-switching when stuck. Satisfaction that reinforces directions producing genuine progress.

We don't know how to wire this yet. But the substrate exists, and it's already influencing behavior. That's a different problem from the one we thought we had.

The implication

If intuition is a cache, then "intuition" isn't a single capability you bolt on. It's the residue of a long compilation pipeline that requires four things wired together: reasoning trajectories worth compressing (Post 2's problem), outcomes that count as feedback (the stakes problem), a mechanism to integrate the feedback into a fast-firing prior (the value network problem), and a background process that keeps updating the cache between active sessions (the incubation problem).

Current architectures have none of these wired together. You can see partial implementations of each. RAG and Auto Dream gesture at the first. RLHF at the second. Value networks at the third. Auto Dream again at the fourth, in early form. They don't connect, and a cache that doesn't compile is just a database.
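For what "wired together" might mean at the smallest possible scale, here is a toy sketch in Python. It is an illustration under heavy assumptions, not a proposal for an architecture: the IntuitionLoop class, the running-average prior, and the hand-written outcome function are all hypothetical, and the situation and action names simply echo the plot-matching example from earlier.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, str]          # (situation, action)
Step = Tuple[str, str, float]  # (situation, action, outcome)


class IntuitionLoop:
    """Toy loop: trajectories -> outcomes -> fast prior -> background update."""

    def __init__(self, actions: List[str], outcome: Callable[[str, str], float]):
        self.actions = actions
        self.outcome = outcome                               # (2) consequences as feedback
        self.trajectories: List[Step] = []                   # (1) paths worth compressing
        self.prior: Dict[Key, float] = defaultdict(float)    # (3) fast-firing prior
        self.counts: Dict[Key, int] = defaultdict(int)

    def act(self, situation: str) -> str:
        """Fast path: pick whatever the compiled prior currently favours."""
        return max(self.actions, key=lambda a: self.prior[(situation, a)])

    def explore(self, situation: str) -> None:
        """Slow path: try every action and log what actually happened."""
        for action in self.actions:
            result = self.outcome(situation, action)
            self.trajectories.append((situation, action, result))

    def consolidate(self) -> None:
        """(4) Between sessions: fold logged outcomes into the prior (running mean)."""
        for situation, action, result in self.trajectories:
            key = (situation, action)
            self.counts[key] += 1
            self.prior[key] += (result - self.prior[key]) / self.counts[key]
        self.trajectories.clear()


def outcome(situation: str, action: str) -> float:
    # Hypothetical ground truth: investigating a discrepancy pays off,
    # tuning it away does not.
    return 1.0 if action == "investigate" else -1.0


loop = IntuitionLoop(["tune_to_match", "investigate"], outcome)
before = loop.act("plot_mismatch")  # empty prior: the tie falls to the first action
loop.explore("plot_mismatch")
loop.consolidate()
after = loop.act("plot_mismatch")   # the prior now fires without any search
print(before, after)
```

Remove any one of the four pieces and the sketch degrades in a recognizable way: without consolidate the log is just a database, and without a consequence-based outcome the prior compiles the wrong thing.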

The interesting work isn't in any single component. It's in the loop. Build the loop and the cache builds itself. Skip the loop and you can scale forever and still get a system that always has to think from scratch.

That's worse than slow. It's the wrong shape of cognition.


References

[1] Silver, D., et al. (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, 362(6419), 1140–1144. https://www.science.org/doi/10.1126/science.aar6404

[2] Damasio, A. R. (1994). Descartes' Error: Emotion, Reason, and the Human Brain. G.P. Putnam's Sons. (Original presentation of the somatic marker hypothesis.)

[3] Hamming, R. W. (1986). You and Your Research. Bell Communications Research Colloquium Seminar transcript. https://www.cs.virginia.edu/~robins/YouAndYourResearch.html

[4] Schwartz, M. (2026). Vibe physics: The AI grad student. Anthropic Science Blog. https://www.anthropic.com/research/vibe-physics

[5] Meta FAIR. (2026). TRIBE v2: A foundation model of vision, audition, and language for in-silico neuroscience. (The five recovered functional networks from ICA on the model's final layer include the default mode network, primary auditory, language, motion, and visual.) https://ai.meta.com/research/publications/a-foundation-model-of-vision-audition-and-language-for-in-silico-neuroscience/

[6] Sofroniew, N., Kauvar, I., Saunders, W., et al. (2026). Emotion Concepts and their Function in a Large Language Model. Anthropic / Transformer Circuits. https://transformer-circuits.pub/2026/emotions/index.html (arXiv: https://arxiv.org/abs/2604.07729)