The most revealing thing about this AI leadership paper is that it reads less like a vision for innovation and more like a glossy whitepaper for a 21st century East India Company.
Every generation of incumbents discovers a new moral vocabulary for why they alone should control transformative technology.
In the 90s it was cryptography. We were told strong encryption was too dangerous to spread because terrorists, rogue states, chaos, dual-use, etc. So the US crippled exports, weakened products, slowed adoption, and kneecapped parts of its own software industry. Right up until reality steamrolled the policy and we woke up to its stupidity. Then e-commerce, secure communications, software signing, and the modern internet exploded and gave us tremendous benefits.
Now the exact same priesthood has returned with AI.
- “Dual-use.”
- “Strategic advantage.”
- “Model distillation.”
- “National security.”
- “Responsible access.”
A few different nouns but mostly the same ones. Same instinct:
Centralize control, gatekeep compute, fuse state and corporate power, and call it safety.
The funniest part is that this strategy is almost perfectly designed to accelerate the thing they claim to fear.
You do not stop a rival superpower (who happens to be the absolute best at scaling energy and manufacturing and who has a chokehold on rare-earth refining) from building domestic capability by permanently attempting to strangle them.
You create the economic and political incentive for total self-sufficiency.
We have already done that, as Jensen warned. We went from 100% of the market to nearly 0%. Huawei is now manufacturing millions of chips. DeepSeek v4 trained on them. They have more energy than the rest of the world combined. Meanwhile, we have activists and anti-economic fools like AOC and Bernie pushing for data center moratoriums, we can't build a single bullet train in 20 years, folks are fighting against expanding the energy grid here, and new nuclear plants get tied up in environmental regulation for a decade.
The sanctions did the exact opposite of what the hawks wanted. They jumpstarted a moribund dinosaur of a Chinese chip industry. We basically said to the people who happen to control the most powerful manufacturing engine on the planet, "we intend to squeeze you."
They rightly saw it as an existential threat.
The sanctions became the industrial policy.
Huawei. SMIC. Domestic lithography. Packaging. Memory. Entire Chinese supply chains that did not exist at serious scale a decade ago now exist precisely because Washington convinced Beijing they had no choice.
Brilliant work.
So the endgame here is what exactly?
1) Push China into a Manhattan Project for chips and AI.
2) Increase the strategic value of Taiwan even further.
3) Once China reaches self-sufficiency, it can invade Taiwan and choke off our own super-advanced chips, which are made there exclusively (and no, we don't have even close to enough TSMC factories in Arizona or anywhere else in the world).
That's every NVIDIA chip. Every Google tensor chip. Every Apple chip. Every chip in your iPhone and Android phone. Every Amazon chip. The chips in your car and truck and hair dryer and washing machine.
4) Escalate a cold tech war into a permanent civilizational bloc conflict that is likely to turn into a shooting war at some point.
5) Fragment the global software ecosystem.
6) Create American AI aristocracies protected by regulation and compute licensing.
And somehow call this “open innovation.”
Meanwhile the actual history of software keeps screaming the opposite lesson:
Knowledge diffuses, open ecosystems win, developers route around gatekeepers, and attempts to permanently contain computation usually fail.
What really jumps off the page is the assumption that a tiny cluster of frontier labs should become quasi-sovereign actors, deciding who gets intelligence, who gets compute, who gets models, and which countries are permitted to participate in the future.
Not elected governments.
Not open markets.
Not open-source communities.
A handful of corporations sitting beside the national security state, insisting that concentration of power is necessary to protect democracy.
You almost have to admire the audacity.
The degree of projection in this paper is genuinely insane.
Your regular reminder that Anthropic was the first AI frontier lab to actively work with the Pentagon and US intelligence agencies to help them with their global surveillance program.
Back in 2024 already, they partnered with Palantir "to make Anthropic's Claude models available to U.S. intelligence and defense agencies".
And remember the whole story earlier this year on how they refused to let their AI be used for “mass domestic surveillance”? Notice something there?
By definition it means that they agree with *NON-domestic* mass surveillance, meaning Anthropic has absolutely no problem with the U.S. military-industrial complex using their AI to surveil all 8 billion inhabitants on Earth, provided it excludes the 340 million Americans. And even the latter can be surveilled, just not in a “mass” way (whatever that means).
Which, incidentally, is actually merely a restatement of U.S. law. Mass domestic surveillance of Americans is prohibited anyhow by the Fourth Amendment, and mass foreign surveillance is authorized under FISA Section 702 and Executive Order 12333 - the legal architecture Edward Snowden exposed in 2013.
So Anthropic’s so-called “principled” stance is them actually supporting the very same legal architecture that, back when the Snowden revelations broke in 2013, was rightly condemned as the most sweeping surveillance regime in the world (which it factually is).
Which means that them warning that China's AI may be used for surveillance, and as such is dangerous, is literally them taking everyone for complete idiots.
All the more due to the fact that, because China's AI is largely open source, you can use it in a way where you keep complete control of your data - unlike Anthropic 🤷♂️
Today I’m launching AI IQ — frontier AI models, scored on the human IQ scale.
Instead of endless leaderboard tables, AI IQ shows:
• Where models land on the IQ bell curve
• How frontier IQ is changing over time
• How models compare on IQ and EQ
• What intelligence costs in practice
GPT-5.5, Claude Opus 4.7, Gemini 3.1, Grok 4.3, Kimi K2.6, Qwen3.6, DeepSeek V4, Muse Spark, and more.
Link in the first reply. Curious which chart surprises you most.
This is, by any measure, an extraordinary article: Prince Turki Al-Faisal is a son of King Faisal and ran Saudi intelligence (the GID) for over two decades.
He is writing that the plan of "the US-Israeli war on Iran" was "to ignite war between us [Saudi Arabia] and Iran," so that Israel could "impose its will on the region and remained the only actor in our surroundings."
This further confirms that, contrary to what many have asserted, the notion that the Saudis were quietly backing the war on Iran was a myth (alongside the recent fact that the Saudis denied the U.S. access to their bases and airspace).
From the horse's mouth they're literally saying it was as much a war on them as it was on Iran!
Pretty crazy when you think about it: this is Saudi Arabia saying that their real enemy in this war was the U.S. and Israel. Hard to overstate how significant a rupture this represents.
Now of course they could be saying so because, seeing how the war turned out, they're trying to retroactively position themselves on the winning side (at least strategically, by saying they didn't take the bait), or trying to justify domestically why they absorbed hits from Iran without retaliating.
And, of course, it's not like they're presenting Iran as some sort of ally here: Prince Turki explicitly calls them a "neighbor" that caused "pains."
But still, the end result remains: the Saudi establishment is now committing, on the record and in plain language, to a framing in which, while Iran is a "painful neighbor", the U.S. and Israel represent the deeper strategic threat, having tried to engineer their destruction.
If you had any lingering doubt that this war accelerated the collapse of U.S. influence in the region, this should settle it.
An early Claude Mythos Preview snapshot we provided METR has a time horizon of more than 2x the next best model on their 80% success rate benchmark
Every year, this has to be the one report I look forward to the most: the Democracy Perception Index, compiled by the Alliance of Democracies Foundation (in partnership with Nita Data).
In fact, my yearly thread on the report is apparently such a tradition that, this year, its lead researcher personally sent me the report with this message: "every year, I look forward to your thread about it!". That's how you start wondering whether you tweet too much 😅
Why do I like this report so much? A few reasons:
1) The Alliance of Democracies Foundation, the organization behind the report, cannot even remotely be suspected of being some sort of anti-West outlet: it was started by an ex-NATO Secretary General (Anders Fogh Rasmussen) and its stated purpose is "to unite world democracies"
2) It's surprisingly honest and the methodology is actually democratic. Unlike other reports on democracy the scoring isn't done by the report's authors (like the report by Freedom House or The Economist's "Democracy Index"). It simply asks people what they think and, when it comes to democracy, that's kind of the point 🤷♂️
3) I love the expression "perception is reality" because, like it or not, what people believe about their system is what determines its legitimacy. A democracy that nobody actually experiences as one can't credibly claim to be one. And conversely, a so-called "autocracy" that its people overwhelmingly believe is actually a democracy might... actually be a democracy.
Anyhow, this year's edition did not disappoint. The data is absolutely fascinating and frankly, a little terrifying. So here you go: my thread on the 2026 Democracy Perception Index 🧵
One of the things that made the Mythos release hard to interpret is that Anthropic held back details on most vulns they found, to give defenders time to patch.
1 month later, info from orgs with access to Mythos is starting to trickle out, e.g. this post from Mozilla today:
Since GPT-4o, frontier average scores on METR-Horizon have been remarkably predictable over time.
A simple linear fit of average score vs. release date gives R² = 0.984.
The relationship between average score and log time horizon is also extremely strong:
- p50 horizon: r = 0.998
- p80 horizon: r = 0.992
Claude Mythos scored 85.21%, slightly above the ~83.3% predicted by the pre-Mythos linear trend.
The implied doubling time for METR time horizons is still about 103 days, the same value we reported on February 12th, 2026.
If current trends continue:
- 90% score: July 7, 2026
- implied p50 horizon: 27.5 hours
- implied p80 horizon: 4.8 hours
- 95% score: September 18, 2026
- implied p50 horizon: 44.9 hours, or 1.9 days
- implied p80 horizon: 7.8 hours
- 100% score: November 30, 2026
- implied p50 horizon: 73.4 hours, or 3.1 days
- implied p80 horizon: 12.8 hours
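A quick way to sanity-check those projections: the three milestone dates are evenly spaced, so a constant doubling time should multiply each horizon by the same factor from one milestone to the next. The sketch below is only an illustration using the figures quoted in the post; the names (`growth_factor`, `implied_doubling_days`, `MILESTONES`) are made up for this example, and it assumes clean exponential growth between milestones, which may not be exactly how the original fit was done.

```python
import math
from datetime import date

# Figures quoted in the post (score %, projected date, p50 hours, p80 hours).
DOUBLING_DAYS = 103
MILESTONES = [
    (90, date(2026, 7, 7), 27.5, 4.8),
    (95, date(2026, 9, 18), 44.9, 7.8),
    (100, date(2026, 11, 30), 73.4, 12.8),
]


def growth_factor(days, doubling_days=DOUBLING_DAYS):
    """Multiplier on the time horizon after `days`, assuming exponential growth."""
    return 2 ** (days / doubling_days)


def implied_doubling_days(h_start, h_end, days):
    """Doubling time implied by a horizon growing from h_start to h_end in `days`."""
    return days * math.log(2) / math.log(h_end / h_start)


# Compare each consecutive pair of milestones.
for (s0, d0, p50_0, p80_0), (s1, d1, p50_1, p80_1) in zip(MILESTONES, MILESTONES[1:]):
    gap = (d1 - d0).days
    slope = (s1 - s0) / gap  # score points per day on the linear trend
    print(f"{s0}% -> {s1}%: {gap} days, {slope:.4f} points/day")
    print(f"  growth factor from doubling time: {growth_factor(gap):.3f}")
    print(f"  p50 projected {p50_0 * growth_factor(gap):.1f} h vs quoted {p50_1} h")
    print(f"  doubling time implied by p50: {implied_doubling_days(p50_0, p50_1, gap):.1f} days")
    print(f"  doubling time implied by p80: {implied_doubling_days(p80_0, p80_1, gap):.1f} days")
```

Each gap is 73 days, which at a 103-day doubling time is a factor of roughly 1.63 per step. The quoted horizons are rounded to one decimal, so the doubling times recovered from them only land near 103 days rather than exactly on it.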
People keep confusing a bubble with “stocks go up and get overvalued”. A bubble is when a prevailing trend and a prevailing misconception about that trend interact reflexively, each reinforcing the other until the gap between perception and reality becomes unsustainable.
A bubble is not when everyone realizes that right now every iota of AI demand eventually, at some point upstream, must move through memory OEMs. Nor is it when estimates continue rising because things are better than expected. And it’s not just when stocks trade expensive to historical valuations.
The reason behind the moves in the AI infrastructure layer so far has simply been that we don’t have enough. They’ve been driven by the fundamental reality more than the perception of the future. It’s why the bulk of the most bullish parts of this cycle have been lumpy and centered around earnings season, when companies uniformly come out and confirm there’s still not enough. In a bubble, the reality is driven by the market - not the other way around.
Everyone keeps saying “people are gonna freak out if it’s not a bubble!”. I think that’s silly. We have a transformative new technology coming to fruition that needs crazy capital to fuel it. That always has resulted, and always will result, in a bubble as long as we have financial markets.
But if you want to call the top in a bubble, you need a much stronger view on what the misconception is and what negative catalyst forces broad perception to align with realizing it than you do on valuation.
We evaluated an early version of Claude Mythos Preview for risk assessment during a limited window in March 2026. We estimated a 50%-time-horizon of at least 16hrs (95% CI 8.5hrs to 55hrs) on our task suite, at the upper end of what we can measure without new tasks.
Alignment is still a very young field, and it has been attacked from few angles. The next big idea may come from importing the right tool from the right discipline.
Post with @edmund_lau80313, @CameronHolmes92, and @geoffreyirving, about bringing expertise to bear on alignment.
New Anthropic research: Natural Language Autoencoders.
Models like Claude talk in words but think in numbers. The numbers—called activations—encode Claude’s thoughts, but not in a language we can read.
Here, we train Claude to translate its activations into human-readable text.
Neural networks might speak English, but they think in shapes.
Understanding their rich *neural geometry* is key to understanding how they work – and to debugging and controlling them with precision.
Starting today, we’re releasing a series of posts on this research agenda. 🧵
My team at @GoodfireAI has been cooking up a new way to do interpretability: decompose a language model’s weights, not its activations.
Our decomposition natively handles attention (!) and behaves less like a lookup table and more like a generalizing algorithm. (1/6)
Every Democrat since Carter has decreased the federal deficit.
Every Republican since Eisenhower has increased the federal deficit.
I read Goldman Sachs’ AI report, and I was genuinely impressed.
The core insight is as follows:
Agentic AI could turn AI from a capex-heavy cost burden into a business where usage growth drives margin expansion. As token costs fall, more complex agents become economically viable. These agents then consume far more tokens through longer context windows, repeated reasoning loops, validation, tool use, and always-on background monitoring.
This increase in token usage improves infrastructure utilization, strengthens unit economics, and gives hyperscalers and model providers more room to reinvest in model quality, distribution, and capacity.
In other words, the bull case for AI capex is not simply that usage will grow. It is that this usage growth can increasingly flow through at attractive incremental margins. Goldman Sachs argues that this margin inflection is beginning to appear from 2026 onward.
Excited to share new research with Jon Kutasov, @saprmarks, @sprice354_: Model Spec Midtraining (MSM)
The Model Spec sets out how AIs should behave and why. MSM trains AIs on documents about the spec. This can improve how AIs generalize from subsequent alignment training.
When the Iran war started I wrote an article arguing, in response to claims it was all about cutting China's energy supply, that on the contrary it'd prove to be the best advertising campaign ever for China's green energy platform.
It's now a fact 👇
Two reflections on my IKP sanity check with @justanotherlaw. Viral papers are cheaper than ever to produce thanks to AI agents. Thankfully, AI agents also make checking viral claims easier than ever.