Zach Mueller (@TheZachMueller): PinchBench results for Qwen3.5 27B using @UnslothAI K_XL quants, best of 3, thinking enabled. TL;DR: Q3_K_XL (14.5GB) or Q4_K_XL (18GB). While the overall "best" results showed little degradation, if you dig into mean/std, Q4_K_XL was the best overall at ~84% on average. Q3 seems viable, while Q2 is the lowest performing, of course.

Zach Mueller

@TheZachMueller

Head of Dev Rel at @LambdaAPI. Hardware nerd. Usually yelling at NCCL over things. Posts are my own.

Joined April 2016

808 Following · 16K Followers

Zach Mueller@TheZachMueller

2026.03.21 19:46

PinchBench results for Qwen3.5 27B using @UnslothAI K_XL quants, best of 3, thinking enabled. TL;DR: Q3_K_XL (14.5GB) or Q4_K_XL (18GB). While the overall "best" results showed little degradation, if you dig into mean/std, Q4_K_XL was the best overall at ~84% on average. Q3 seems viable, while Q2 is the lowest performing, of course.
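To illustrate why the best-of-3 view and the mean/std view can disagree, here is a minimal sketch. The scores are made-up placeholders, not the PinchBench numbers from the post, and the `runs` dictionary and `summarize` helper are hypothetical names used only for this example.

```python
from statistics import mean, stdev

# Hypothetical placeholder scores (percent) for three runs per quant.
# These are NOT the actual PinchBench results from the post.
runs = {
    "Q2_K_XL": [70.0, 80.0, 60.0],
    "Q3_K_XL": [82.0, 85.0, 79.0],
    "Q4_K_XL": [84.0, 85.0, 83.0],
}

def summarize(scores):
    """Return best-of-n, mean, and sample standard deviation for one quant."""
    return {"best": max(scores), "mean": mean(scores), "std": stdev(scores)}

summary = {name: summarize(scores) for name, scores in runs.items()}

for name, s in summary.items():
    print(f"{name}: best={s['best']:.1f} mean={s['mean']:.1f} std={s['std']:.1f}")
```

With these placeholder numbers, Q3 and Q4 tie on "best" while Q4 has the higher mean and lower spread, which is the kind of gap that only shows up once you look past the single best run.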
