Pavlo Molchanov
@PavloMolchanov
Director of Research @NVIDIA
Joined March 2014
436 Following    3.9K Followers
We are releasing Star Elastic - turn ONE reasoning LLM into MANY sizes with a single post-training run. 360× cheaper than pretraining a family of models. 7× better than SOTA compression. Split reasoning capability. Plus elastic budget control that beats the accuracy-latency frontier. Paper: HF models: Thread 👇