stevibe (@stevibe): 2.3x faster. Ran @UnslothAI Qwen3.6 MTP variants on a DGX Spark (UD-Q6_K_XL):
> 27B → 27B MTP: 8.1 → 18.65 t/s (2.3x faster)
> 35B A3B → 35B A3B MTP: 56.91 → 66.52 t/s (+17%)
The 27B dense model more than doubled throughput from MTP alone. Free speed is free speed.

stevibe (@stevibe)

2026.05.13 17:14

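For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the speedups from the tokens-per-second figures quoted in the post. The `speedup` helper and the `results` dictionary are illustrative names, not part of any benchmark tool.

```python
def speedup(baseline_tps: float, mtp_tps: float) -> float:
    """Ratio of MTP throughput to baseline throughput (both in tokens/s)."""
    return mtp_tps / baseline_tps


# Figures copied from the post: (baseline t/s, MTP t/s) on a DGX Spark, UD-Q6_K_XL.
results = {
    "27B": (8.1, 18.65),
    "35B A3B": (56.91, 66.52),
}

for name, (base, mtp) in results.items():
    ratio = speedup(base, mtp)
    # Show both the multiplier and the percentage gain over baseline.
    print(f"{name}: {ratio:.2f}x ({(ratio - 1) * 100:+.0f}%)")
```

The ratios come out to roughly 2.30x for the 27B dense model and 1.17x (+17%) for the 35B A3B model, matching the numbers in the post.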