Keller Jordan
@kellerjordan0
CIFAR-10 fanatic Pretraining @OpenAI OpCo LLC.
428 Following    16.9K Followers
New modded-NanoGPT optimization benchmark result: @wen_kaiyue has improved upon both the Muon and AdamW baselines, by replacing their weight decay with hyperball optimization. The new record is 3325 steps.