i've run a stack of models across a single 3090, a 5090, and a 128GB DGX Spark. exactly three are worth building on. here's the honest list.
the three worth it:
> 1. StepFun Step-3.5 Flash, the REAP-pruned 121B MoE (Q6, DGX Spark): a 121 billion parameter mixture of experts running on a single desktop box. the most worth-it model in everything i've tested.
> 2. Qwen 3.6 27B Dense, Q4 (single RTX 3090): the undisputed king of the 24GB tier. one-shotted a playable game, runs at around 41 tok/s, fits with context headroom to spare. if you have one 24GB card, this is your answer.
> 3. NVIDIA Nemotron 3 Nano Omni, 30B-A3B (DGX Spark): the best multimodal model i've tested for video classification work. takes vision input, runs clean on the Spark.
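for anyone who wants to sanity-check the "fits with headroom" claims, here's a rough back-of-envelope sketch. it only estimates weight memory as parameters times bits per weight divided by 8. the 4.5 and 6.5 bits-per-weight figures are my assumptions for what "Q4" and "Q6" work out to in practice (the exact quant formats aren't stated above), and the function names are made up for illustration.

```python
# rough weight-memory estimate: params (in billions) * bits per weight / 8 = GB
# ASSUMPTION: 4.5 and 6.5 effective bits/weight are illustrative stand-ins
# for "Q4" and "Q6"; real quant formats differ by a few tenths of a bit.

def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in decimal GB."""
    return params_billion * bits_per_weight / 8


def headroom_gb(memory_gb: float, params_billion: float, bits_per_weight: float) -> float:
    """Memory left after loading weights (ignores KV cache and runtime overhead)."""
    return memory_gb - weight_gb(params_billion, bits_per_weight)


# (label, device memory in GB, params in billions, assumed bits per weight)
configs = [
    ("27B dense, Q4, 24GB card", 24, 27, 4.5),
    ("121B MoE, Q6, 128GB Spark", 128, 121, 6.5),
]

for label, memory, params, bits in configs:
    used = weight_gb(params, bits)
    left = headroom_gb(memory, params, bits)
    print(f"{label}: ~{used:.1f} GB weights, ~{left:.1f} GB left")
```

whatever is left over has to cover the KV cache, activations, and the runtime itself, so treat the headroom number as an upper bound, not a promise.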
the rest: i ran them, and they hold up fine.
on the Spark:
DeepSeek V4 Flash 158B, GLM 4.7 Flash, GLM 4.5 Air REAP 82B-A12B, Gemma 4 26B-A4B, Qwen3-VL 235B-A22B, Qwen3 Coder 30B-A3B, Qwen3 30B-A3B, Carnice 35B-A3B.
on consumer GPUs:
Kimi K2.5 1T, Qwen3-Coder-Next 80B, Hermes 4.3 36B, Qwen 3.5 27B Dense.
from a single 3090 to a 128GB Spark, that's the range. the three up top are the ones worth your hardware today.