Jiayi Pan
@jiayi_pirate
Research | Prev @xAI @Berkeley_AI | Views Are My Own
Joined September 2021
1.6K Following    14.4K Followers
We reproduced DeepSeek R1-Zero in the CountDown game, and it just works.
Through RL, the 3B base LM develops self-verification and search abilities all on its own.
You can experience the Aha moment yourself for < $30.
Code:
Here's what we learned 🧵