xAI
@xai
Joined May 2023
5 Following    2M Followers
Humanity's Last Exam (HLE) is a rigorous intelligence benchmark featuring over 2500 problems crafted by experts in mathematics, natural sciences, engineering, and humanities. Most models score single-digit accuracy. Grok 4 and Grok 4 Heavy outperform all others.