I can't believe this works, but I got DeepSeek-V4-Flash (284B params) running on a Raspberry Pi 5 (8GB edition) at >1tok/s @ ~8W during full-tilt inference! It uses an untouched copy of
@antirez's GGUF. Took 160+ experiments over 5 days between GPT-5.5 xhigh and Opus 4.8 max.