The Underpriced Truth: Agentic AI Is a Paradigm Shift Centered on Memory
1/ The market will slowly realize:
Agentic AI is memory-centric, not compute-centric.
The new hardware stack is: ① Memory — HBM / DRAM / NAND ② Parallel compute — GPU / ASIC ③ Coordinator — CPU
CPUs stopped doing the heavy lifting a long time ago. This isn't a cycle. It's a paradigm. 🧵👇
2/ First principles
Humanity's ultimate pursuit of intelligence has always been two things: Infinite memory + infinite compute.
When we say someone is smart, we mean two things: "good memory" + "fast thinking."
Machine intelligence is walking the exact same path.
3/ The story the market already understands: HBM
LLM inference's decode stage is a textbook memory-bound workload.
Every token generated → drag the entire KV cache across memory. Bandwidth too low → expensive GPUs sit idle.
That's why every new GPU generation ships with more HBM bandwidth and capacity.
4/ The story the market is missing
The "1M context" you keep hearing about?
It is not assembled inside the GPU inference cluster.
So where is it actually built?
5/ It's built on the traditional servers running the agentic system
Those CPU + huge-DRAM servers are quietly doing the heaviest lifting:
• loading user long-term & short-term memory • loading the agent's system spec / prompt • loading skill / tool / subagent definitions • compressing the context once it overflows 1M tokens
All of this lives in DRAM, not HBM.
6/ Compare this to the previous era
In the web / mobile era, we barely stored any user context at all.
Only search / recsys / ads kept a small user profile — maybe 1/20, even 1/100 of the data volume an agentic system needs today.
That asymmetry is the real overlooked inflection point.
7/ The supply chain is already telling this story
Server CPU : DRAM ratio is climbing fast: • Web / Mobile era: 1 core : 4 GB • Agentic AI today: 1 core : 16 GB • Deep agentic future: 1 core : 64 GB and beyond
8/ And it's NOT just "4x more memory"
Under agentic workloads, a single CPU serves a fraction of the users it used to.
When the entire IT stack migrates to agentic: • CPU count grows several-fold to ~10x • DRAM total grows tens-fold to ~100x
That's the part nobody is pricing in.
9/ The conclusion
Agentic AI is a paradigm shift centered on storage + parallel compute.
The software paradigm changed. The hardware paradigm changed with it.
Only those who deeply understand the technology will see it:
This isn't a memory cycle. It's a memory paradigm.
10/ Time horizon
Given how early we still are on: • user adoption rate • depth of usage per user
We are at least 5 years away from the cyclical top of this memory wave.
(Zoom out far enough and everything is a cycle — but this one is nowhere near peak.)
$MU $DRAM $SNDK
顯示更多