Pavlo Molchanov
@PavloMolchanov
Director of Research @NVIDIA
436 Following    3.9K Followers
We are releasing Star Elastic - turn ONE reasoning LLM into MANY sizes with a single post-training run. 360× cheaper than pretraining a family of models. 7× better than SOTA compression. Split reasoning capability. Plus elastic budget control that beats the accuracy-latency frontier. Paper: HF models: Thread 👇