We've actually gone farther than this. Nemotron 3 Super (120B-12A) was pretrained on 25T tokens in NVFP4. Nemotron 3 Ultra was also pretrained in NVFP4.
This research paper advances the state of NVFP4 pretraining but it is not just research, we are using NVFP4 for our most important pretraining work.
显示更多