Bolian Li
@lblaoke
PhD Candidate @PurdueCS | Interning @Apple MLR | Reinforcement Learning, Bayesian Deep Learning, Large Language Models
Joined October 2023
183 Following    83 Followers
Scaling up RL training with more data often runs into performance saturation, which wastes compute. We find that a precisely crafted entropy curve is all you need to avoid this saturation, and we achieve it purely through rejection sampling.