Pavlo Molchanov
@PavloMolchanov
Director of Research @NVIDIA
Joined March 2014
436 Following    3.9K Followers
We are releasing Star Elastic - turn ONE reasoning LLM into MANY sizes with a single post-training run. 360× cheaper than pretraining a family of models. 7× better than SOTA compression. Split reasoning capability. Plus elastic budget control that beats the accuracy-latency frontier. Paper: HF models: Thread 👇