I successfully got mlx-node running in the browser! I've implemented a WebGPU backend that can run Qwen3.5 0.8b.
Right now it's still unquantized bf16 (f32 in WebGPU) with no optimizations applied; it just runs as is, but there look to be plenty of interesting things to try from here!
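
For context on the "bf16 (f32 in WebGPU)" note: WGSL has no bf16 type, so one plausible way to get bf16 weights onto the GPU is to widen them to f32 on the host first. The sketch below is only an illustration of that idea, not code from mlx-node, and it may not match what the backend actually does. The function name `bf16ToF32` is hypothetical. It relies on the fact that a bf16 value is the upper 16 bits of an IEEE 754 f32, so widening is a 16-bit left shift.

```typescript
// Hypothetical helper (not part of mlx-node): widen raw bf16 values to f32.
// A bf16 value is the top 16 bits of an f32, so we shift it into the high
// half of a 32-bit word and leave the low 16 mantissa bits as zero.
function bf16ToF32(bf16: Uint16Array): Float32Array {
  const out = new Float32Array(bf16.length);
  // View the same memory as raw 32-bit patterns so we can write bits directly.
  const bits = new Uint32Array(out.buffer);
  for (let i = 0; i < bf16.length; i++) {
    // `>>> 0` keeps the result an unsigned 32-bit integer.
    bits[i] = (bf16[i] << 16) >>> 0;
  }
  return out;
}

// Example input: bf16 bit patterns for 1.0, -2.0 and 3.140625.
const widened = bf16ToF32(new Uint16Array([0x3f80, 0xc000, 0x4049]));
console.log(Array.from(widened));
```

The widening is exact, since every bf16 value is representable in f32, but it doubles the memory needed for the weights, which is one reason quantization would be a natural next step.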