Akshay 🚀
@akshay_pachaar
The MCP vs CLI debate. For most of 2025, AI engineers argued about it.

The skeptics had real numbers:
- Playwright MCP eats 13.7K tokens
- Chrome DevTools MCP eats 18K
- A 5-server setup burns 55K tokens before any work

The defenders pushed back:
- CLIs break on multi-tenant apps
- No typed contracts, so the agent guesses at outputs
- On unfamiliar APIs, agents waste turns parsing text

Both sides were arguing about the wrong thing.

In November 2025, Anthropic published "Code execution with MCP" and reframed the debate from first principles. The problem was never the protocol. It was the habit of dumping every tool's full description into the model's context the moment a session starts. Add the data those tools return, passed through the model on every step, and a single workflow can balloon to 150K tokens, most of which the model never needed.

The fix is to flip the model's job. Instead of the model calling tools through its context, the model writes code that calls tools through a runtime. The runtime is where tools live. The model only sees what it imports.

In Anthropic's example, a Google Drive transcript flows into a Salesforce CRM update. The old way loaded both tool schemas and piped the entire transcript through the model twice. The new way is ten lines of TypeScript that import what they need. Same task, 2K tokens. A 98.7% drop.

Cloudflare pushed the idea to its limit. They collapsed their entire 2,500-endpoint API from 1.17M tokens of schemas down to 1K tokens, by exposing just two functions: search and execute. The agent writes code that searches the catalog, then executes only what matches.

The new pattern has a name: Code Mode. It is a runtime where the agent writes code that mixes two primitives:
- Bash, for anything with a binary already installed, like git or curl
- Typed module imports, for proprietary APIs, where the type signatures load only when the agent actually imports the tool

That second part is the unlock.
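A minimal sketch of that typed-import primitive, assuming a Drive-to-Salesforce workflow like Anthropic's example. Everything here is illustrative: in a real Code Mode runtime the interface would live in its own module (something like servers/salesforce.ts) and proxy to an MCP server, and names like updateOpportunity are invented for this sketch.

```typescript
// The typed contract the agent sees when it imports the tool.
// In a real runtime this lives in its own module and loads only on import.
interface OpportunityUpdate {
  id: string;
  stage: "Negotiation" | "Closed Won";
  notes: string;
}

// Hypothetical tool stub; a real runtime would forward this call to the
// MCP server instead of returning a canned result.
function updateOpportunity(update: OpportunityUpdate): { ok: boolean; id: string } {
  return { ok: true, id: update.id };
}

// Agent-written workflow: the transcript flows through local variables
// inside the runtime, never back through the model's context.
const transcript = "Customer agreed to pricing. Deal is moving to Closed Won.";
const result = updateOpportunity({
  id: "006-illustrative-id",
  stage: transcript.includes("Closed Won") ? "Closed Won" : "Negotiation",
  notes: transcript,
});
console.log(result.ok, result.id);
```

The point of the sketch: the interface is the whole context cost. The transcript itself is just a local variable.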
Types travel with imports, so the agent gets a strict contract for the tools it picks and pays nothing for the ones it skips. MCP's typed contracts plus CLI's lazy loading, in one runtime. The agent picks per task.

"MCP is dead" was the wrong takeaway. Anthropic just reported 300M MCP SDK downloads, up from 100M at the start of the year. The protocol is not dying. It is the fastest-growing piece of agent infrastructure right now. What died was loading every tool upfront. That was always a bad idea.

If you are building agents in 2026, the rule is simple: tool definitions belong in code, not in context. The model writes a few lines that call them. The runtime does the rest.

That is what the debate was actually about.
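What "tool definitions belong in code, not in context" looks like in miniature: a sketch of the search-and-execute surface Cloudflare exposed. The catalog entries and endpoint names here are invented for illustration; the real surface spans thousands of endpoints, but the model's context only ever holds these two function signatures.

```typescript
type Endpoint = {
  name: string;
  description: string;
  run: (params: Record<string, unknown>) => unknown;
};

// Hypothetical catalog standing in for the full API surface.
// It lives in the runtime, not in the model's context.
const catalog: Endpoint[] = [
  {
    name: "dns.records.list",
    description: "List DNS records for a zone",
    run: (p) => [{ type: "A", zone: String(p.zone) }],
  },
  {
    name: "workers.deploy",
    description: "Deploy a Worker script",
    run: (p) => ({ deployed: p.script }),
  },
];

// The only two tools the model ever sees.
function search(query: string): { name: string; description: string }[] {
  const q = query.toLowerCase();
  return catalog
    .filter((e) => e.name.includes(q) || e.description.toLowerCase().includes(q))
    .map(({ name, description }) => ({ name, description }));
}

function execute(name: string, params: Record<string, unknown>): unknown {
  const endpoint = catalog.find((e) => e.name === name);
  if (!endpoint) throw new Error(`Unknown endpoint: ${name}`);
  return endpoint.run(params);
}

// Agent-written code: search the catalog, then execute only what matches.
const hits = search("dns");
const records = execute(hits[0].name, { zone: "example.com" });
console.log(hits[0].name, JSON.stringify(records));
```

Schemas stay out of context until the agent asks for them, and the agent pays only for the endpoints it actually runs.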