We are releasing Star Elastic - turn ONE reasoning LLM into MANY sizes with a single post-training run.
360× cheaper than pretraining a family of models.
7× better than SOTA compression.
Split reasoning capability.
Plus elastic budget control that beats the accuracy-latency frontier.
Paper:
HF models:
Thread below.