Zihan "Zenus" Wang
@wzenus
Reasoning agent / RL / efficiency research @NorthwesternU & incoming @nvidia. Ex @Microsoft @yutori_ai @deepseek_ai @uiuc_nlp @RUC1937.
Joined March 2022
665 Following    23K Followers
In Agent RL, models suffer from Template Collapse. They generate vast, diverse outputs (High Entropy) that lose all meaningful connection to the input prompt (Low Mutual Information). In other words, agents learn different ways to say nothing. 🚀 Introducing RAGEN-v2 -- Here's how we define and fix such silent failure modes in Agent RL. 🧵
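
To make the "high entropy, low mutual information" distinction concrete, here is a minimal toy sketch. It is not the RAGEN-v2 method or its metric: the prompt and template labels, the helper names `entropy` and `mutual_information`, and the plug-in estimator are all assumptions chosen for illustration. It shows two policies with identical output entropy, only one of which depends on the prompt.

```python
import math
from collections import Counter

def entropy(samples):
    """Shannon entropy (bits) of the empirical distribution of `samples`."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) = H(X) + H(Y) - H(X, Y) from (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    return entropy(xs) + entropy(ys) - entropy(pairs)

# Hypothetical toy data: 4 prompts, 4 possible response templates.
prompts = ["p0", "p1", "p2", "p3"]
templates = ["t0", "t1", "t2", "t3"]

# "Collapsed" policy: every prompt receives every template equally often,
# so the output is diverse but carries no information about the prompt.
collapsed = [(p, t) for p in prompts for t in templates]

# "Grounded" policy: each prompt always receives its own template
# (repeated 4 times so both datasets contain 16 samples).
grounded = [(p, t) for p, t in zip(prompts, templates) for _ in range(4)]

for name, data in [("collapsed", collapsed), ("grounded", grounded)]:
    h_out = entropy([t for _, t in data])
    mi = mutual_information(data)
    print(f"{name}: H(output) = {h_out:.2f} bits, I(prompt; output) = {mi:.2f} bits")
```

Both toy policies have 2 bits of output entropy, but only the grounded one has nonzero mutual information with the prompt, which is why entropy alone would not flag the collapsed case.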