Michael Goin (@mgoin_) walks through @vllm_project v0.20.0.
752 commits. 320 contributors. 123 new.
DeepSeek V4, TurboQuant 2-bit KV cache, MXFP4 for MoE on Blackwell, FA4 as MLA prefill default, @PyTorch 2.11 + CUDA 13.0, Transformers V5, and a lot more.
~8 minutes.