Zihan "Zenus" Wang
@wzenus
Reasoning agent / RL / efficiency research @NorthwesternU & incoming @nvidia. Ex @Microsoft @yutori_ai @deepseek_ai @uiuc_nlp @RUC1937.
Joined March 2022
665 Following    23K Followers
In Agent RL, models suffer from Template Collapse. They generate vast, diverse outputs (High Entropy) that lose all meaningful connection to the input prompt (Low Mutual Information). In other words, agents learn different ways to say nothing. 🚀 Introducing RAGEN-v2 -- Here's how we define and fix such silent failure modes in Agent RL. 🧵
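
To make the "high entropy, low mutual information" contrast concrete, here is a minimal toy sketch. It is not RAGEN-v2's definition or code (the post does not give one); the function names and the two 4x4 joint tables are hypothetical, chosen only to show that two policies can have the same output entropy while differing completely in how much the output depends on the prompt.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def output_entropy(joint):
    """H(Y): entropy of the output marginal, where joint[i][j] = P(prompt=i, output=j)."""
    return entropy([sum(col) for col in zip(*joint)])

def mutual_information(joint):
    """I(X; Y) in bits between prompt X and output Y."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi

# Hypothetical "collapsed" policy: every prompt yields the same uniform
# spread over 4 output templates, so outputs ignore the prompt.
collapsed = [[1 / 16] * 4 for _ in range(4)]

# Hypothetical "grounded" policy: each prompt maps to its own output.
grounded = [[0.25 if i == j else 0.0 for j in range(4)] for i in range(4)]

for name, joint in [("collapsed", collapsed), ("grounded", grounded)]:
    print(name, output_entropy(joint), mutual_information(joint))
```

Both toy policies have 2 bits of output entropy, but only the grounded one has nonzero mutual information (2 bits versus 0), which is why entropy alone would not flag the collapsed case.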