Haider.
@haider1
together, we build an intelligent future.
Joined November 2021
3.8K Following    66.3K Followers
really cool benchmark for long-horizon test-time adaptation

gpt-5.5 in codex leads on FutureSim, where agents interact with a chronological replay of real-world news and are tasked with predicting future events on some Polymarket questions. gpt-5.5 even moved ahead of the human market aggregate

interestingly, gemini 3.1 and opus 4.7 are missing