Jinjie Ni
@NiJinjie
Research Scientist @GoogleDeepMind
Joined April 2020
651 Following    3.6K Followers
Token crisis: solved. ✅

We pre-trained diffusion language models (DLMs) vs. autoregressive (AR) models from scratch, at up to 8B params, 480B tokens, 480 epochs.

Findings:
> DLMs beat AR when tokens are limited, with >3× data potential.
> A 1B DLM trained on just 1B tokens hits 56% HellaSwag & 33% MMLU. No tricks, no cherry-picks.
> No saturation: more repeats = more gains.

🚨 We also dissected the serious methodological flaws in our parallel work “Diffusion Beats Autoregressive in Data-Constrained Settings”. Let’s raise the bar for open review!

🔗 Blog & details:

18 🧵s ahead: