Yiran
@yiran2037840
PhD @ On-device AI & AI Infra | Investor | Not Financial Advice
Joined April 2026
77 Following    1.7K Followers
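On the vLLM inference engine, the NVIDIA GB300 Ultra NVL72 delivers real end-to-end performance 2.7x that of the GB200 NVL72, far beyond the roughly 1.5x on-paper gains in FLOPs and memory in the hardware specs. This gain essentially comes from compounding optimizations across the full AI infra stack. Under real AI serving workloads (the middle of the throughput vs. interactivity curve), it produces a 1+1>2 amplification effect.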
MINECRAFT STEVE ALERT: GB300 Ultra NVL72 is already 2.7x faster 🚀 than GB200 NVL72 on @vllm_project, one of the industry-standard inference engines. On paper, GB300 has only ~1.5x the NVFP4 FLOPs, 1.5x the HBM capacity, and the same HBM bandwidth as GB200, but thanks to full-stack optimization with compounding gains, GB300 is up to 2.7x faster in the middle of the curve, where most providers serve. End-to-end performance is the gold standard of performance, not theoretical on-paper FLOPs. Thanks to the 10x engineers at NVIDIA, @inferact, and @coreweave for this temporary GB300 for open source projects!
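
A rough back-of-the-envelope sketch of the gap between the two figures quoted above. It assumes, purely for illustration, that the ~1.5x on-paper FLOP gain translates fully into speedup and that the remaining gain multiplies on top of it; neither post states such a split, and `residual_factor` is a hypothetical helper name.

```python
# Figures quoted in the posts above. Treating them as cleanly separable
# multiplicative factors is an illustrative assumption, not a measured result.
measured_speedup = 2.7  # GB300 vs GB200 end to end on vLLM ("up to", mid-curve)
paper_flop_gain = 1.5   # ~1.5x NVFP4 FLOPs on paper


def residual_factor(measured: float, paper: float) -> float:
    """Return the part of a measured speedup not explained by the paper gain,
    assuming the two combine multiplicatively."""
    return measured / paper


residual = residual_factor(measured_speedup, paper_flop_gain)
print(f"Gain beyond paper specs: ~{residual:.2f}x")
```

Under these assumptions, roughly 1.8x of the 2.7x would come from something other than raw FLOPs, which is the "compounding gains" the posts attribute to full-stack optimization.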