Jiayi Pan
@jiayi_pirate
Research | Prev @xAI @Berkeley_AI | Views Are My Own
Joined September 2021
1.6K Following    14.4K Followers
We reproduced DeepSeek R1-Zero in the CountDown game, and it just works.
Through RL, the 3B base LM develops self-verification and search abilities all on its own.
You can experience the Aha moment yourself for < $30.
Code:
Here's what we learned 🧵