Lisan al Gaib
@scaling01
lead them to paradise LisanBench: Impressum & Datenschutz:
Joined August 2024
1K Following    43.9K Followers
New forecasting benchmark: FutureSim. GPT-5.5 performs the best at 25%, but Mythos, Gemini 3.1 Pro, and Opus 4.7 are not included. Based on their Brier Skill Scores, the models don't seem to be much better than just assigning equal probabilities to all outcomes.
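For context on that last point, here is a minimal sketch of how a Brier Skill Score relates to the "equal probabilities" baseline. It assumes the reference forecast is a uniform distribution over each question's outcomes and uses the multi-class Brier score; FutureSim's actual scoring setup may differ, and the function names and example numbers are purely illustrative.

```python
def brier_score(probs, outcome_index):
    """Multi-class Brier score for one question: sum of squared errors
    between the forecast and the one-hot vector of the realized outcome."""
    return sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(probs)
    )


def brier_skill_score(forecasts, outcomes):
    """BSS = 1 - BS_model / BS_reference, summed over all questions.

    Assumption: the reference forecast assigns equal probability to every
    outcome of each question (a uniform baseline).
    """
    model_total = 0.0
    reference_total = 0.0
    for probs, outcome in zip(forecasts, outcomes):
        k = len(probs)
        uniform = [1.0 / k] * k
        model_total += brier_score(probs, outcome)
        reference_total += brier_score(uniform, outcome)
    return 1.0 - model_total / reference_total


# Hypothetical four-outcome question where outcome 0 happened.
uniform_guess = brier_skill_score([[0.25, 0.25, 0.25, 0.25]], [0])
slight_edge = brier_skill_score([[0.4, 0.2, 0.2, 0.2]], [0])
perfect = brier_skill_score([[1.0, 0.0, 0.0, 0.0]], [0])
```

Under these assumptions, a score of 0 means no improvement over equal probabilities, 1 is a perfect forecast, and negative values are worse than the uniform baseline. A model whose score sits close to 0 is adding little beyond guessing uniformly.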