Zach Mueller
@TheZachMueller
Head of Dev Rel at @LambdaAPI. Hardware nerd. Usually yelling at NCCL over things. Posts are my own.
Joined April 2016
808 Following    16K Followers
PinchBench results for Qwen3.5 27B using @UnslothAI K_XL quants, best of 3, thinking enabled. TL;DR: Q3 KXL (14.5GB) or Q4 KXL (18GB). While the overall "best" results showed little degradation, if you dig into mean/std, Q4_K_XL was the best overall at ~84% on average. Q3 seems viable, while Q2 is the lowest performing, of course.
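A rough sketch of why best-of-3 and mean/std can rank things differently. The scores and the names `config_a` and `config_b` are made-up placeholders for illustration, not the PinchBench numbers from this post:

```python
from statistics import mean, stdev

# Hypothetical placeholder scores (percent) over 3 runs each.
# These are NOT the PinchBench results; they only illustrate the arithmetic.
runs = {
    "config_a": [80.0, 90.0, 70.0],  # one lucky run, high variance
    "config_b": [84.0, 85.0, 83.0],  # consistent across runs
}


def summarize(scores):
    """Return best-of-N, mean, and sample standard deviation for one config."""
    return {"best": max(scores), "mean": mean(scores), "std": stdev(scores)}


summary = {name: summarize(scores) for name, scores in runs.items()}

# Ranking by the single best run vs. ranking by the average of all runs.
best_by_max = max(summary, key=lambda name: summary[name]["best"])
best_by_mean = max(summary, key=lambda name: summary[name]["mean"])

for name, stats in summary.items():
    print(f"{name}: best={stats['best']:.1f} mean={stats['mean']:.1f} std={stats['std']:.1f}")
print("winner by best-of-3:", best_by_max)
print("winner by mean:", best_by_mean)
```

With these placeholders, the config with the highest single run is not the one with the highest average, which is the kind of gap that looking at mean/std exposes.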