Rohan Paul
@rohanpaul_ai
Compiling in real-time, the race towards AGI. The Largest Show on X for AI. 🗞️ Get my daily AI analysis newsletter to your email 👉
Joined June 2014
7.4K Following    148.4K Followers
Terence Tao says the math behind today’s LLMs is actually simple. Training and running them mostly uses linear algebra, matrix multiplication, and a bit of calculus, material an undergraduate can handle. We understand how to build and operate these models.

The real mystery is why they work so well on some tasks and fail on others, and why we cannot predict that in advance. We lack good rules for forecasting performance across tasks, so progress is largely empirical.

A key reason is the nature of real-world data. Pure noise is well understood, perfectly structured data is well understood, but natural text sits in between, partly structured and partly random. Mathematics for that middle regime is thin, similar to how physics struggles at meso-scales between atoms and continua.

Because of this gap, we can describe the mechanisms but cannot yet explain capability jumps or give reliable task-level predictions. That mismatch, simple machinery versus hard-to-predict behavior, is the core puzzle.

Video from 'Dr Brian Keating' YT Channel (Link in comment)