NVIDIA AI
@NVIDIAAI
Teaching your AI new tricks.
Joined June 2016
853 Following    290.6K Followers
Love it. Well done.
Google dropped MTP versions of Gemma4. Ran them on my DGX Spark. The 31B dense model went from 3.94 → 8.91 tok/s. That's +126%.

Full results:

[26B A4B]
> 25.24 → 31.69 tok/s (+25.6%)
> TTFT 755 → 332ms (-56%)

[31B]
> 3.94 → 8.91 tok/s (+126%)
> TTFT 599 → 378ms (-37%)

If you're not running MTP, you're leaving free perf on the table.
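For anyone who wants to sanity-check the percentages quoted in the post, here is a minimal Python sketch. It only redoes the arithmetic on the numbers given; the helper name `pct_change` and the dictionary labels are made up for illustration and are not part of any benchmark tool.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (before, after) pairs copied from the post; labels are illustrative only.
results = {
    "26B A4B tok/s": (25.24, 31.69),
    "26B A4B TTFT (ms)": (755, 332),
    "31B tok/s": (3.94, 8.91),
    "31B TTFT (ms)": (599, 378),
}

for name, (before, after) in results.items():
    # Positive means the number went up (good for tok/s),
    # negative means it went down (good for TTFT).
    print(f"{name}: {pct_change(before, after):+.1f}%")
```

The results line up with the rounded figures in the post: throughput gains are positive changes in tok/s, and the TTFT improvements show up as negative changes because lower latency is better.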