
Lisan al Gaib
@scaling01
lead them to paradise LisanBench: Impressum & Datenschutz:
Joined August 2024
1K Following    43.9K Followers
new forecasting benchmark: FutureSim. GPT-5.5 performs the best at 25%, but Mythos, Gemini 3.1 Pro and Opus 4.7 are not included. Based on their Brier Skill Scores, the models don't seem to be much better than just assigning equal probabilities to all outcomes.
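
For context on that last point, here is a minimal sketch of how a Brier Skill Score against an equal-probability baseline is commonly computed. The post does not say how FutureSim scores forecasts, so the multi-category Brier score and the uniform reference below are assumptions, and the function names `brier_score` and `brier_skill_score` are hypothetical. A score near 0 means the forecaster is about as good as the uniform baseline, and 1 means perfect forecasts.

```python
def brier_score(forecasts, outcomes):
    """Mean multi-category Brier score.

    forecasts: list of probability lists, one per question.
    outcomes: list of indices of the outcome that actually occurred.
    """
    total = 0.0
    for probs, winner in zip(forecasts, outcomes):
        # Squared error against the one-hot vector of the realized outcome.
        total += sum(
            (p - (1.0 if k == winner else 0.0)) ** 2
            for k, p in enumerate(probs)
        )
    return total / len(forecasts)


def brier_skill_score(forecasts, outcomes):
    """1 - BS / BS_ref, with BS_ref from equal probabilities (assumed baseline)."""
    uniform = [[1.0 / len(probs)] * len(probs) for probs in forecasts]
    reference = brier_score(uniform, outcomes)
    return 1.0 - brier_score(forecasts, outcomes) / reference


# Made-up illustrative data, not taken from the benchmark.
example_forecasts = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
example_outcomes = [0, 2]
print(brier_skill_score(example_forecasts, example_outcomes))
```

Under this definition, a model can pick the right outcome fairly often and still score near 0 if its probabilities are poorly calibrated.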