Scaling up RL training with more data often runs into performance saturation, wasting compute.
We find that a precisely crafted entropy curve is all you need to avoid this saturation, and we achieve it purely through rejection sampling.
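The mechanism is not spelled out in this summary, so the sketch below is only one plausible reading of "shaping the entropy curve via rejection sampling": score each sampled rollout by its mean token entropy and keep only rollouts near a scheduled target. All names here (`mean_token_entropy`, `entropy_rejection_filter`, `target_entropy`, `tol`) are hypothetical illustrations, not the authors' API.

```python
import numpy as np

def mean_token_entropy(logprobs: np.ndarray) -> float:
    """Mean per-token Shannon entropy of one rollout.
    `logprobs` has shape (T, vocab): per-step log-probabilities."""
    probs = np.exp(logprobs)
    return float(-(probs * logprobs).sum(axis=-1).mean())

def entropy_rejection_filter(rollout_logprobs, target_entropy, tol=0.1):
    """Rejection sampling on rollouts: keep the indices whose mean
    token entropy lies within `tol` of the scheduled target,
    discarding the rest. (Illustrative criterion, assumed here.)"""
    return [
        i for i, lp in enumerate(rollout_logprobs)
        if abs(mean_token_entropy(lp) - target_entropy) <= tol
    ]

# Toy usage: three fake rollouts of length 16 over a 4-token vocabulary.
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 16, 4))  # (rollouts, T, vocab)
logprobs = logits - np.log(np.exp(logits).sum(-1, keepdims=True))
print(entropy_rejection_filter(logprobs, target_entropy=1.2, tol=0.2))
```

In this reading, `target_entropy` would follow the prescribed entropy schedule over the course of training, so the accepted batches trace out the desired entropy curve without any auxiliary entropy loss term.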