Bearly AI
@bearlyai
Privacy-first AI research tool with access to ChatGPT, Grok, Claude, Gemini and DeepSeek — all in one app (by @pnegahdar and @trungtphan)
Jensen called this “probably the single most important chart for the future of AI factories”. The Y-axis is “Throughput” (total volume) while the X-axis is “Token Speed” (more tokens per second = more interactivity for a user + more context + more reasoning). The Cerebras IPO is very much about this chart.

Firms market and price token offerings on those two variables, which are in tension. A free tier is typically high throughput but lower token speed. Meanwhile, the priciest tier would have lower throughput but high-value tokens (e.g. research, coding). SemiAnalysis makes the analogy of a “bus vs a Ferrari: you can choose to serve lots of users slowly, a single user quickly, or anything in between.”

Nvidia’s challenge is to build systems that lift the entire line up and to the right. Jensen says the Vera Rubin architecture improves the revenue opportunity 5x vs. Blackwell. Then, if you add Groq to Vera Rubin, that revenue opportunity is up 10x vs. Blackwell.

Groq is Nvidia’s option for delivering the higher-value tokens at speed, which is the same market that Cerebras is targeting. Cerebras is attacking the problem with a massive, single-wafer design. Meanwhile, Groq uses multiple smaller, connected chips and a specialized processor architecture (the Language Processing Unit, aka LPU).
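
To make the bus-vs-Ferrari tension concrete, here is a toy sketch. It is not Nvidia's, Groq's, or Cerebras's actual performance model. It assumes each decode step has a fixed cost plus a small cost per concurrent user, and the function names and timing constants are hypothetical, chosen only to show the shape of the curve.

```python
# Toy model (assumption): one decode step costs a fixed overhead plus a
# per-user increment. All numbers are illustrative, not measured.

def decode_step_seconds(batch_size, fixed_s=0.010, per_seq_s=0.0005):
    """Hypothetical time for one decode step serving `batch_size` users."""
    return fixed_s + per_seq_s * batch_size

def operating_point(batch_size):
    """Return (token speed per user, total throughput) in tokens/second."""
    step = decode_step_seconds(batch_size)
    token_speed = 1.0 / step        # X-axis: each user gets one token per step
    throughput = batch_size / step  # Y-axis: all users' tokens per second
    return token_speed, throughput

# Sweep from "Ferrari" (one user) to "bus" (many users per step).
points = [(b, *operating_point(b)) for b in (1, 8, 64, 512)]
for b, speed, total in points:
    print(f"batch={b:>3}  tokens/s per user={speed:7.1f}  total tokens/s={total:8.1f}")
```

In this toy model, growing the batch raises total throughput while lowering each user's token speed, which traces one line on the chart. Better hardware would correspond to smaller timing constants, moving the whole line up and to the right.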