GitHub was many people's dream job, partly because employees were encouraged to speak publicly about the company: we could fly anywhere for conferences and the company would cover all costs. IMO that's what GitHub should do now, let employees talk about what is going wrong there.
Ghostty is leaving GitHub. I'm GitHub user 1299, joined Feb 2008. I've visited GitHub almost every single day for over 18 years. It's never been a question for me where I'd put my projects: always GitHub. I'm super sad to say this, but it's time to go.
I had a prof who checked file modification times in a multi-file programming homework to determine whether you had just copied everything from classmates. Guess what, I refactored my code before submitting. I only found out when I confronted him after a shockingly low score.
Teachers requiring you to submit your assignments as Google Docs so they can look through the edit history and tell whether you used AI is killing the "do it at the last minute" crowd.
MLX's implementation of RDMA (Remote Direct Memory Access) over Thunderbolt on macOS can now be used as an independent library by anyone:
It is the gem that powers Mac clusters for local AI, and is an order of magnitude faster than protocols over TCP.
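For a sense of how this gets used in practice, here is a minimal sketch of driving MLX's distributed API from Python; the standalone RDMA library's own interface may differ, and the launch command and host names are illustrative assumptions.

# Minimal sketch of MLX's distributed Python API (the standalone RDMA
# library's own interface may differ; host names below are placeholders).
import mlx.core as mx

# Initialize the distributed group; which transport is used (Thunderbolt
# ring, MPI, ...) depends on how the script is launched.
world = mx.distributed.init()
print(f"rank {world.rank()} of {world.size()}")

# All-reduce an array across every Mac in the cluster.
x = mx.ones((4, 4))
total = mx.distributed.all_sum(x)
mx.eval(total)

# Typically launched across machines with something like:
#   mlx.launch --hosts mac1,mac2 my_script.py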
A nice step forward. We are still bottlenecked at human reviewers, though. I wonder if we should just have a category of model implementations that are maintained by AI and contributors, and only have reviewers verify the correctness of the test suite.
A long time coming, but the new mlx-lm is here with better batching support in the server and support for Gemma 4.
pip install -U mlx-lm
Here is a video where a single M3 Ultra serves 5 opencode sessions with Gemma 4 26B, processing ~130k tokens in ~1.5 minutes.
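If you want to try the server yourself, a minimal sketch follows; the model name is only an example, and the port is the mlx-lm default.

# Start the server first (model name is just an example):
#   mlx_lm.server --model mlx-community/some-model --port 8080
# Then hit the OpenAI-compatible chat endpoint:
import json
import urllib.request

payload = {
    "messages": [{"role": "user", "content": "Write a haiku about Apple silicon."}],
    "max_tokens": 128,
}
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])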
We are hosting Ollama's MLX meetup this Thursday night (April 9th) at Ollama's office in Palo Alto at 6pm.
Come meet amazing people!
RSVP is required as space is very limited.
Food & drinks will be available.
More details:
I successfully got mlx-node running in the browser! I've implemented a WebGPU backend that can run Qwen3.5 0.8b.
Currently it's still full-precision bf16 (f32 in WebGPU) and hasn't undergone any optimizations; it just runs as is, but it looks like there could be many interesting things to do in the future!
We have been expecting this since Ollama's first pull request to MLX. It is just the beginning: the CUDA & CPU backends are still improving, and hopefully we will have one framework unifying inference & training for all platforms.
Ollama is now updated to run at its fastest on Apple silicon, powered by MLX, Apple's machine learning framework.
This change unlocks much faster performance for demanding work on macOS:
- Personal assistants like OpenClaw
- Coding agents like Claude Code, OpenCode, or Codex
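The backend swap should be transparent to clients, so the existing Ollama API still applies; here is a minimal sketch, assuming the ollama Python package is installed and the (example) model has already been pulled.

# Minimal sketch: the MLX backend doesn't change the client-facing API.
# Assumes `pip install ollama` and that the example model has been pulled
# (the model name below is just a placeholder).
import ollama

response = ollama.chat(
    model="gemma3",
    messages=[{"role": "user", "content": "Summarize what MLX is in one sentence."}],
)
print(response["message"]["content"])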