METR
@METR_Evals
We work to scientifically measure whether and when AI systems might threaten catastrophic harm to society. Nonprofit.
Joined September 2023
35 Following    24.4K Followers
We evaluated an early version of Claude Mythos Preview for risk assessment during a limited window in March 2026. We estimated a 50%-time-horizon of at least 16hrs (95% CI 8.5hrs to 55hrs) on our task suite, at the upper end of what we can measure without new tasks.