Percy Liang
@percyliang
professor of computer science @Stanford @stanfordnlp, co-founder of @togethercompute, creator of co-founder of @simile_ai, pianist
Joined October 2009
426 Following    104.5K Followers
Marin is using quantile balancing from @Jianlin_S (who developed RoPE, which was also a good idea) to train our current 1e23 FLOPs MoE. The idea is elegant: assigning tokens to experts by solving a linear program. No hyperparameters to tune. Yields stable training.
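To make "assigning tokens to experts by solving a linear program" concrete, here is a minimal toy sketch. It is not Marin's code and not necessarily the quantile balancing algorithm itself. It assumes top-1 routing, a token count divisible by the expert count, and a hard equal-load constraint. Under those assumptions the LP is a transportation problem with integral optimal solutions, so it can be solved as an assignment problem over replicated expert "slots". The names `hungarian` and `balanced_assign` are hypothetical.

```python
def hungarian(cost):
    """Min-cost assignment for an n x m cost matrix with n <= m.

    Returns a list giving the column chosen for each row.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (dual variables)
    v = [0.0] * (m + 1)   # column potentials (dual variables)
    p = [0] * (m + 1)     # p[j] = row matched to column j (1-indexed, 0 = free)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:  # flip matches along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assign = [0] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assign[p[j] - 1] = j - 1
    return assign


def balanced_assign(scores):
    """Assign each token to one expert, maximizing total router score
    subject to every expert receiving exactly n_tokens / n_experts tokens."""
    n, e = len(scores), len(scores[0])
    assert n % e == 0, "toy assumption: tokens divide evenly across experts"
    cap = n // e
    # Replicate each expert into `cap` slots; negate scores to maximize.
    cost = [[-scores[i][slot // cap] for slot in range(n)] for i in range(n)]
    return [slot // cap for slot in hungarian(cost)]


# Router scores for 4 tokens and 2 experts; plain argmax would send
# every token to expert 0.
scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
assignment = balanced_assign(scores)
print(assignment)
```

In this toy case the balanced optimum keeps the two tokens with the largest score gap in favor of expert 0 on that expert and moves the other two to expert 1. With two experts, that amounts to thresholding the score gap at its median. A production router would not run an O(n^3) solver per batch; this only illustrates the objective and the constraint.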
Researchers' brilliant ideas often get lost in the sea of endless SOTA claims on weak baselines. At Marin we battle-test ideas in an open arena, where anyone's idea can be promoted to the next hero run. One that recently rose up was @Jianlin_S MoE Quantile Balancing, used in our last 1e22 and ongoing 130B run. Animated visuals of how QB performed are available in the OpenAthena blog.