Jinjie Ni
@NiJinjie
Research Scientist @GoogleDeepMind
Joined April 2020
651 Following    3.6K
Token crisis: solved. ✅

We pre-trained diffusion language models (DLMs) vs. autoregressive (AR) models from scratch, at up to 8B params, 480B tokens, and 480 epochs.

Findings:
> DLMs beat AR when tokens are limited, with >3× data potential.
> A 1B DLM trained on just 1B tokens hits 56% HellaSwag & 33% MMLU, with no tricks and no cherry-picks.
> No saturation: more repeats = more gains.

🚨 We also dissected the serious methodological flaws in our parallel work “Diffusion Beats Autoregressive in Data-Constrained Settings”. Let’s raise the bar for open review!

🔗 Blog & details:

18 🧵s ahead: