Bryan Catanzaro
@ctnzr
VP, Applied Deep Learning Research @ NVIDIA
Joined February 2011
474 Following    26K Followers
We've actually gone farther than this. Nemotron 3 Super (120B-12A) was pretrained on 25T tokens in NVFP4. Nemotron 3 Ultra was also pretrained in NVFP4. This research paper advances the state of NVFP4 pretraining, but it is not just research: we are using NVFP4 for our most important pretraining work.