A new paper from the Anthropic Fellows Program!
"Model Spec Midtraining: Improving How Alignment Training Generalizes"
A lot of alignment training teaches models what to say, but not why those behaviors are right.
So before normal alignment fine-tuning, this research trains the model on synthetic documents that discuss its Model Spec, including its values, rules, and reasoning, and then does the usual supervised alignment training.
On agentic misalignment evals, Model Spec Midtraining followed by alignment fine-tuning (MSM + AFT) cuts misalignment from 68% -> 5% on Qwen2.5-32B and 54% -> 7% on Qwen3-32B, beating the baseline.
The gain shows up especially out of distribution, where normal “say the aligned thing” training can look good in QA but break in harder scenarios.
NEW research from FAIR at Meta, Cornell, and CMU.
This paper is a bigger deal than it seems.
Apparently, you don't need billions of parameters to teach an AI model to reason.
The default approach to post-training language models for reasoning today remains finetuning millions or even billions of parameters.
But what if the signal needed for reasoning is far sparser than we assume?
This new research introduces TinyLoRA, a method that scales low-rank adapters down to as few as a single trainable parameter.
Using TinyLoRA with RL, they trained Qwen2.5-7B to 91% accuracy on GSM8K with only 13 parameters in bf16. That's 26 total bytes.
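The 26-byte figure follows directly from the parameter count and the precision: bf16 stores each value in 2 bytes. A minimal sketch of that arithmetic, where the helper name `adapter_size_bytes` is hypothetical and chosen only for illustration:

```python
# Storage needed for a set of trainable parameters at a given precision.
# The helper name and the dtype table are illustrative, not from the paper.
BYTES_PER_PARAM = {"bf16": 2, "fp16": 2, "fp32": 4}


def adapter_size_bytes(num_params: int, dtype: str = "bf16") -> int:
    """Return the number of bytes needed to store `num_params` values."""
    return num_params * BYTES_PER_PARAM[dtype]


print(adapter_size_bytes(13))          # 13 params * 2 bytes = 26
print(adapter_size_bytes(13, "fp32"))  # the same update in fp32 = 52
```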
So what's the idea?
RL and SFT require fundamentally different amounts of model capacity. SFT must absorb the full demonstration, encoding both task-relevant structure and irrelevant noise into the update. RL receives a sparser, cleaner signal. The reward separates what matters from what doesn't, so resampling amplifies useful information while noise cancels out.
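One way to see how a reward "separates what matters" is the group-relative advantage used in GRPO, the RL method named in the results. The sketch below is a simplified illustration, not the paper's code: it assumes a binary correctness reward and normalizes with the population standard deviation, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of samples for the same prompt.

    Samples that beat the group average get a positive advantage and are
    reinforced; samples below it get a negative one. Assumption: population
    standard deviation plus a small eps to avoid division by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one problem: two correct (1.0), two wrong (0.0).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)  # roughly [1, -1, -1, 1]

# If every sample gets the same reward, there is no learning signal.
print(group_relative_advantages([1.0, 1.0, 1.0]))  # all zeros
```

Only the scalar reward enters the update, so details of the sampled text that do not affect correctness contribute no systematic signal.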
Here are the results:
On GSM8K, models trained with GRPO reach 90% accuracy with fewer than 100 parameters. Models of the same capacity trained with SFT barely outperform the base model. On harder benchmarks like MATH500, AIME, and AMC, finetuning just 196 parameters retains 87% of the absolute performance improvement averaged across six benchmarks.
The trend scales with model size, too. Larger models need proportionally smaller updates, suggesting trillion-scale models may be trainable for many tasks with just a handful of parameters.
The key takeaway is that reasoning may already live inside pretrained models. RL doesn't inject new knowledge; it surfaces what's already there, and it can do so with almost no parameter change at all.
Paper:
Learn to build effective AI agents in our academy:
xLSTM excels in time series forecasting.
Introduces "stochastic xLSTM" (StoxLSTM).
"StoxLSTM consistently outperforms state-of-the-art baselines with better robustness and stronger generalization ability."
TiRex shows that xLSTM is time series king.
@DavidSHolz @NicolasPerezNi1 Not entirely clear to me, I've mostly worked on the audiovisual modalities since 2023 so I wasn't around for this😬
The diffusion duality paper is nice in this regard, it potentially enables discrete models to access some of those theoretical advantages
New survey on diffusion language models: (via @NicolasPerezNi1). Covers pre/post-training, inference and multimodality, with very nice illustrations.
I can't help but feel a bit wistful about the apparent extinction of the continuous approach after 2023🥲
Token crisis: solved. ✅
We pre-trained diffusion language models (DLMs) vs. autoregressive (AR) models from scratch — up to 8B params, 480B tokens, 480 epochs.
Findings:
> DLMs beat AR when tokens are limited, with >3× data potential.
> A 1B DLM trained on just 1B tokens hits 56% HellaSwag & 33% MMLU — no tricks, no cherry-picks.
> No saturation: more repeats = more gains.
We also dissected the serious methodological flaws in our parallel work “Diffusion Beats Autoregressive in Data-Constrained Settings” — let’s raise the bar for open review!
🔗 Blog & details:
18 🧵s ahead:
Andrej Karpathy shares a 3-step blueprint on how to master anything
New blog post about diffusion language models:
Diffusion models have completely taken over generative modelling of perceptual signals -- why is autoregression still the name of the game for language modelling? And can we do anything about that?
This is interesting as a first large diffusion-based LLM.
Most of the LLMs you've been seeing are ~clones as far as the core modeling approach goes. They're all trained "autoregressively", i.e. predicting tokens from left to right. Diffusion is different - it doesn't go left to right, but all at once. You start with noise and gradually denoise into a token stream.
Most of the image / video generation AI tools actually work this way and use Diffusion, not Autoregression. It's only text (and sometimes audio!) that have resisted. So it's been a bit of a mystery to me and many others why, for some reason, text prefers Autoregression, but images/videos prefer Diffusion. This turns out to be a fairly deep rabbit hole that has to do with the distribution of information and noise and our own perception of them, in these domains. If you look close enough, a lot of interesting connections emerge between the two as well.
All that to say that this model has the potential to be different, and possibly showcase new, unique psychology, or new strengths and weaknesses. I encourage people to try it out!
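The left-to-right versus all-at-once contrast described above can be made concrete with a toy. This is only a sketch of the decoding order, under loud assumptions: it uses the masked (discrete) flavor of diffusion, where "noise" is a mask token, and the "model" is a stand-in that copies from a known target instead of predicting anything. All names are hypothetical.

```python
import random

MASK = "_"


def autoregressive_decode(target):
    """Fill positions strictly left to right, one token per step."""
    seq = [MASK] * len(target)
    history = []
    for i in range(len(target)):
        seq[i] = target[i]  # stand-in for sampling the next token
        history.append(list(seq))
    return history


def masked_diffusion_decode(target, steps, rng):
    """Start fully masked and reveal tokens anywhere, several per step."""
    seq = [MASK] * len(target)
    masked = list(range(len(target)))
    rng.shuffle(masked)  # positions are revealed in no particular order
    history = []
    for step in range(steps):
        remaining_steps = steps - step
        k = -(-len(masked) // remaining_steps)  # ceil division
        for _ in range(k):
            pos = masked.pop()
            seq[pos] = target[pos]  # stand-in for the model's denoising
        history.append(list(seq))
    return history


target = "the cat sat on the mat".split()
for state in autoregressive_decode(target):
    print("AR  ", " ".join(state))
for state in masked_diffusion_decode(target, steps=3, rng=random.Random(0)):
    print("DIFF", " ".join(state))
```

Both procedures end at the same sentence; what differs is the number of steps and which positions get committed first.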
This is the most powerful Deep Research AI prompt I've ever used
It can literally make you thousands of dollars
The AI will do research based on your niche and give you a DETAILED plan on how to build software for that niche (no experience required)
BOOKMARK THIS
Prompt:
I create content about [YOUR SUBJECT OR NICHE HERE]. I want you to perform thorough, in-depth research on this niche by analyzing its common pain points, the root causes of those problems, and the types of solutions that exist (if any).
1. Identify Key Challenges:
• Provide me with 5 major challenges that people in my niche frequently encounter.
• Explain each challenge in detail, focusing on:
    • Why it occurs
    • Who is most impacted
    • What current (if any) solutions or workarounds exist
2. Propose Software Solutions:
• For each of the 5 challenges, propose one unique software idea that could solve or significantly reduce that challenge.
• Break each software idea down into:
    • Core Functionality: What does it do? How does it address the challenge directly?
    • Key Features: List 3–5 critical features that make it stand out from existing solutions.
    • Value Proposition: Clearly explain how it benefits users, saves time/money, or simplifies tasks compared to other tools on the market.
    • Potential Tech Stack / Implementation Notes: If applicable, suggest frameworks, languages, or libraries that might be well-suited to build this solution.
3. Cite Sources & Data Points (If Available):
• If you refer to any statistics, facts, or expert opinions, please provide references (studies, articles, or credible sources) to support the claim or finding.
4. Conclusion & Next Steps:
• Summarize why these challenges are significant.
• Emphasize how the proposed software ideas could disrupt or advance the niche.
• Suggest any further reading or research paths that could help refine these software concepts.
• Give me a detailed action plan on how I can get started building these ideas with Cursor. Act as if I have no programming experience.
At the end of your response, provide a concise action plan or checklist summarizing how to go from idea to product validation.
PLUG THIS INTO ANY DEEP RESEARCH TOOL (ChatGPT or Perplexity) AND WATCH THE MAGIC HAPPEN
We reproduced DeepSeek R1-Zero in the CountDown game, and it just works
Through RL, the 3B base LM develops self-verification and search abilities all on its own
You can experience the Aha moment yourself for < $30
Code:
Here's what we learned 🧵
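For readers unfamiliar with the setup, RL on a game like CountDown needs only a rule-based reward that checks the final answer. The sketch below is a hypothetical reward function, not the authors' code. It assumes the task is to combine the given numbers with +, -, *, / to hit a target, and that each number must be used exactly once.

```python
import ast
import operator
from collections import Counter

# Only the four basic arithmetic operators are allowed.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node, used):
    """Evaluate a parsed expression, recording every integer literal used."""
    if (isinstance(node, ast.Constant) and isinstance(node.value, int)
            and not isinstance(node.value, bool)):
        used.append(node.value)
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    raise ValueError("unsupported expression")


def countdown_reward(expression, numbers, target):
    """Return 1.0 if the expression is valid and hits the target, else 0.0."""
    try:
        tree = ast.parse(expression, mode="eval")
        used = []
        value = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0
    if Counter(used) != Counter(numbers):  # each number exactly once
        return 0.0
    return 1.0 if abs(value - target) < 1e-9 else 0.0


print(countdown_reward("(25 - 5) * 4", [4, 5, 25], 80))  # 1.0
print(countdown_reward("25 * 4", [4, 5, 25], 100))       # 0.0, 5 is unused
```

Because the reward only inspects the final expression, any self-verification or search the model shows along the way is behavior it picked up on its own rather than something the reward asked for.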
We have to take the LLMs to school.
When you open any textbook, you'll see three major types of information:
1. Background information / exposition. The meat of the textbook that explains concepts. As you attend over it, your brain is training on that data. This is equivalent to pretraining, where the model is reading the internet and accumulating background knowledge.
2. Worked problems with solutions. These are concrete examples of how an expert solves problems. They are demonstrations to be imitated. This is equivalent to supervised finetuning, where the model is finetuning on "ideal responses" for an Assistant, written by humans.
3. Practice problems. These are prompts to the student, usually without the solution, but always with the final answer. There are usually many, many of these at the end of each chapter. They are prompting the student to learn by trial & error - they have to try a bunch of stuff to get to the right answer. This is equivalent to reinforcement learning.
We've subjected LLMs to a ton of 1 and 2, but 3 is a nascent, emerging frontier. When we're creating datasets for LLMs, it's no different from writing textbooks for them, with these 3 types of data. They have to read, and they have to practice.
@alexanderchen @hapticdata very cool! when people use LLMs like this repeatedly and with very low latencies like it's some kind of free, persistent, almost disposable resource it gives me the "feel the AGI" feels.