Register and share your invite link to earn from video plays and referrals.

Greg Brockman
@gdb
President & Co-Founder @OpenAI
Joined July 2010
8 Following    976K Followers
extremely interesting work from our alignment team
Chain of thought monitors are a key layer of defense against AI agent misalignment. To preserve monitorability, we avoid penalizing misaligned reasoning during RL. We found a limited amount of accidental CoT grading which affected released models, and are sharing our analysis.
Show more