I didn't expect DeepSeek v4 PRO (not Flash) to run well on the Mac Studio M3 Ultra with 512GB of RAM. This is 2 bit quantized with the same DwarfStar recipe used for Flash. 433GB GGUF file. 130 t/s prefill, 13 t/s generation. Prefill in the video is low because small prompt.
显示更多