Kevin Patrick Murphy
@sirbayes
Research Scientist at Google DeepMind. Interested in Bayesian Machine Learning.
Joined October 2016
663 Following    69.7K Followers
I am pleased to announce another update to my RL tutorial. This time I have added code for RLFT for multi-turn LLM agents, using the awesome Tinker library from @thinkymachines and the simple ReBN training loop from GEM by @zzlccc et al. With ~100 lines of simple Python running on your laptop, you can train an agent based on Qwen3-4B-Instruct to play "guess the number" in 20 minutes.
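
To make the training signal concrete, here is a minimal sketch of the two pieces mentioned in the post: a toy "guess the number" environment and a batch-level return normalization step, which is what I take ReBN to refer to. Everything here is an assumption for illustration: the class `GuessTheNumber`, the helpers `discounted_returns`, `batch_normalize`, and `rollout`, the reward scheme, and the random policy are hypothetical names and choices, not code from the tutorial, Tinker, or GEM. It uses no LLM and only shows how per-turn returns could be turned into normalized advantages.

```python
import random
import statistics


class GuessTheNumber:
    """Toy multi-turn environment (hypothetical; not the GEM implementation)."""

    def __init__(self, low=1, high=20, max_turns=8, seed=None):
        self.low, self.high, self.max_turns = low, high, max_turns
        self.rng = random.Random(seed)

    def reset(self):
        self.target = self.rng.randint(self.low, self.high)
        self.turn = 0
        return f"Guess a number between {self.low} and {self.high}."

    def step(self, guess):
        """Return (observation, reward, done) for one guess."""
        self.turn += 1
        if guess == self.target:
            return "correct", 1.0, True
        hint = "higher" if guess < self.target else "lower"
        return hint, 0.0, self.turn >= self.max_turns


def discounted_returns(rewards, gamma=1.0):
    """Return-to-go for each turn of one episode."""
    out, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        out.append(g)
    return out[::-1]


def batch_normalize(returns_per_episode, eps=1e-8):
    """Normalize returns with the mean and std taken over every turn in the batch."""
    flat = [g for episode in returns_per_episode for g in episode]
    mean = statistics.fmean(flat)
    std = statistics.pstdev(flat)
    return [[(g - mean) / (std + eps) for g in episode]
            for episode in returns_per_episode]


def rollout(env, policy):
    """Play one episode and collect the per-turn rewards."""
    obs, rewards, done = env.reset(), [], False
    while not done:
        obs, reward, done = env.step(policy(obs))
        rewards.append(reward)
    return rewards


# A random-guess policy stands in for the LLM agent.
env = GuessTheNumber(seed=0)
policy_rng = random.Random(1)
batch = [
    discounted_returns(rollout(env, lambda obs: policy_rng.randint(1, 20)), gamma=0.9)
    for _ in range(16)
]
advantages = batch_normalize(batch)
```

In an actual policy-gradient loop, each normalized value would weight the log-probability of the tokens the agent produced on that turn; that part depends on the training library and is left out here.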