METR
@METR_Evals
We work to scientifically measure whether and when AI systems might threaten catastrophic harm to society. Nonprofit.
Joined September 2023
35 Following    24.4K Followers
We evaluated an early version of Claude Mythos Preview for risk assessment during a limited window in March 2026. We estimated a 50%-time-horizon of at least 16hrs (95% CI 8.5hrs to 55hrs) on our task suite, at the upper end of what we can measure without new tasks.