Bryan Catanzaro
@ctnzr
VP, Applied Deep Learning Research @ NVIDIA
Joined February 2011
474 Following    26K Followers
We've actually gone farther than this. Nemotron 3 Super (120B-12A) was pretrained on 25T tokens in NVFP4. Nemotron 3 Ultra was also pretrained in NVFP4. This research paper advances the state of NVFP4 pretraining, but it is not just research: we are using NVFP4 for our most important pretraining work.