Chris 🇨🇦
@llm_wizard
Open Source Model Lover @ NVIDIA AI. Views my own.
620 Following    2.3K Followers
ironically think it’ll be a sad time for ai researchers this year. they are first in the hotpath of RSI and probably the market for them will shrink or at least their pricing power will be reduced as this generation of models commoditizes the skills that made them rare
We've actually gone farther than this. Nemotron 3 Super (120B-12A) was pretrained on 25T tokens in NVFP4. Nemotron 3 Ultra was also pretrained in NVFP4. This research paper advances the state of NVFP4 pretraining but it is not just research, we are using NVFP4 for our most important pretraining work.
So grateful for this experience and for all the wonderful people I’ve met on this journey
Over the last week, we had to say goodbye to the little orange menace we affectionately referred to as "The Boy". Hug your pets just a little tighter - they're too good for us.
We've gone even farther: Nemotron 3 Super is 120B and pretrained on 25T tokens in NVFP4. Nemotron 3 Ultra is ~500B and also pretrained in NVFP4. Accelerated computing means we rethink every aspect of the AI stack looking for new opportunities to improve efficiency.
I agree strongly with both Xeo and the blurb. We need open source AI.
amazing post and great timing w.r.t. ant's post yesterday we must build open ai to not get locked in by the vendors who will decide who gets which capabilities and the west has to realize that open models are important and support open model efforts (like @arcee_ai, @NVIDIAAI)
Imagine the generational aura loss if you said: "China is so cracked compared to the US that if we had a fair compute playing field we would definitely lose for sure despite having a literal massive headstart and infrastructure/supply chain advantage."
NEMOTRON MENTIONED. (Also, Ultra mentioned)
LUCAS ATKINS IS HERE TO TALK ABOUT TRAINING BIG MODELS 🫡
I love it when two things I love do cool stuff together.
suuuuper excited to be collaborating with the excellent LangChain Labs team on this effort prod agent tracing is the seed that lets you close the loop for continual learning. too much data gets collected but not used for learning. time to change that :)
This guy is only optimizing optimizing. Not even optimizing the optimization of optimizing.
we’re not even optimizing, we’re optimizing optimizing
What is a claw? 🦞 It's the shift from AI that suggests → AI that acts. Autonomous agents that run 24/7, handling complex work in the background so you don't have to.
Time to yap on some smol MoE’s today. If you’re around AI council, my talk is at 10! Followed by the 🐐’s of @latkins, @ezi_ozoani, @llm_wizard, and @samsja19 Everything from pretraining at home to large scale RL
told you to not sleep on MiMo. what they have accomplished in such a short span is remarkable, their first (7B dense) llm was released exactly a year ago
Finally got freeway early access with Waymo. My boy is MOTORING.
Special guests at the Ask me Anything booth. @NVIDIA! Stop by during lunch!
We are releasing Star Elastic - turn ONE reasoning LLM into MANY sizes with a single post-training run. 360× cheaper than pretraining a family of models. 7× better than SOTA compression. Split reasoning capability. Plus elastic budget control that beats the accuracy-latency frontier. Paper: HF models: Thread 👇