Michael Guo(@Michaelzsguo):So you bought the 128GB MacBook Pro. Now the question is not, “Which local model gets the highest TPS?” It is: which setup can I actually trust to get the job done? This is the local coding stack I’d start with: Qwen 3.6, dense 27B, Q6 quant, MLX server, 8192 output tokens, 20GB prompt cache, and deterministic decoding. If Anthropic’s success story tells us anything, it is that once you figure out coding, you can expand into almost anything else. Local models stop being a hobby when they can finish the patch.

Michael Guo

@Michaelzsguo

Building AI agents and AI-native orgs. Demystifying AI in practice. EN/中文

Joined January 2022

389 Following 2.3K Followers

Michael Guo@Michaelzsguo

2026.05.17 12:43

So you bought the 128GB MacBook Pro. Now the question is not, “Which local model gets the highest TPS?” It is: which setup can I actually trust to get the job done? This is the local coding stack I’d start with: Qwen 3.6, dense 27B, Q6 quant, MLX server, 8192 output tokens, 20GB prompt cache, and deterministic decoding. If Anthropic’s success story tells us anything, it is that once you figure out coding, you can expand into almost anything else. Local models stop being a hobby when they can finish the patch.