
Eric Xu (e/Mettā)
@xleaps
polymath, polyglot, root of a ternary tree. building prev @Meta @Google @Reddit phd in classic ai; rookie pilot 🛩️; martial artist
3.5K Following    34.8K Followers
20 years ago, my first startup was all about enterprise search. Two decades later, we're still building search engines. The technology has shifted from NLP to neural networks and the users from humans to agents, but search is still the core. Open-sourcing the fastest BM25 engine:
And the antidote is motivated reasoning to dismantle your own hypothesis.
The enemy of truth is motivated reasoning.
Claude design is genuinely impressive. Preparing to ship the next version of Noah in a few days, and I cannot wait to see those beautiful dynamic panels rendered in an intuitive way. --- Aiming for the next version of Noah to look this good!
The High-Dimensional Ball Is Full, but Its Probability Mass Lives Near the Boundary

High-dimensional probability is weird. As the dimension grows, a uniform point in the ball does not usually land deep inside. Most of the mass gets pushed into a very thin shell near the boundary.
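To put numbers on the thin-shell claim: for a point drawn uniformly from the unit ball in d dimensions, the probability of landing within radius r is r^d, so the shell of thickness eps next to the boundary holds 1 - (1 - eps)^d of the mass. The sketch below is a minimal illustration; the function names are made up for this example, and the Monte Carlo check relies on the fact that the radius of a uniform point can be sampled as U^(1/d).

```python
import random


def shell_mass(dim: int, eps: float) -> float:
    """Exact probability that a uniform point in the unit ball
    lies within eps of the boundary: 1 - (1 - eps)**dim."""
    return 1.0 - (1.0 - eps) ** dim


def sample_radius(dim: int, rng: random.Random) -> float:
    """Radius of a uniform point in the unit ball.
    The radius has CDF r**dim, so inverse-transform sampling gives U**(1/dim)."""
    return rng.random() ** (1.0 / dim)


def empirical_shell_mass(dim: int, eps: float, n_samples: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability, as a sanity check."""
    rng = random.Random(seed)
    hits = sum(sample_radius(dim, rng) > 1.0 - eps for _ in range(n_samples))
    return hits / n_samples


if __name__ == "__main__":
    for dim in (2, 10, 100, 1000):
        exact = shell_mass(dim, 0.01)
        approx = empirical_shell_mass(dim, 0.01)
        print(f"dim={dim:5d}  exact={exact:.4f}  simulated={approx:.4f}")
```

With a shell only 1% thick, the exact formula gives about 2% of the mass in 2 dimensions, about 63% in 100 dimensions, and more than 99.99% in 1000 dimensions.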
Friends don’t let friends use OpenClaw. Try Hermes agent and thank me later.
What a world! I made a video with Claude yesterday using Manim, and now Hermes agent has it as a skill! It's the best time to be alive.
Introducing the Manim skill for Hermes Agent. Manim is an engine for creating precise programmatic animations for mathematical and technical explainers, made famous by the @3blue1brown channel.
Always loved @3blue1brown's visualizations but never really conquered Manim (the animation library). With Claude as a coding agent, I can finally direct animations at a high level, with no more fighting the library. So I built this: explaining to 12-year-old me why fractals have non-integer dimensions. D = log N / log r. Simple formula. Surprisingly deep rabbit hole. --- Finally achieved courseware freedom: with AI I can explain a piece of knowledge whenever I want. For example, this is a video explaining to my younger self why fractal dimensions are not integers.
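A quick worked example of the formula, assuming the usual reading for self-similar shapes: N is the number of smaller copies the shape splits into, and r is the factor by which each copy is scaled down. The helper name and the list of shapes are chosen for this illustration and are not taken from the video.

```python
import math


def similarity_dimension(n_copies: int, scale_factor: float) -> float:
    """D = log N / log r for a shape made of n_copies copies,
    each scaled down by scale_factor."""
    return math.log(n_copies) / math.log(scale_factor)


# (N, r) pairs for a few standard self-similar shapes.
SHAPES = {
    "line segment": (2, 2),         # D = 1
    "filled square": (4, 2),        # D = 2
    "Cantor set": (2, 3),           # D is about 0.6309
    "Koch curve": (4, 3),           # D is about 1.2619
    "Sierpinski triangle": (3, 2),  # D is about 1.5850
}

if __name__ == "__main__":
    for name, (n, r) in SHAPES.items():
        print(f"{name:20s} N={n} r={r} D={similarity_dimension(n, r):.4f}")
```

Ordinary shapes come out as whole numbers because N is an exact power of r (a square splits into 4 = 2^2 half-size copies). The Koch curve needs 4 copies at one-third scale, and 4 is not a power of 3, so the dimension lands between 1 and 2.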
I have one hour to burn through my Claude token limit. I’ve resorted to feeding it the Linux kernel. This is fine. Genuine question though: what repos are worth having AI actually dig into? Send help (or links). --- Can't use up today's Claude tokens. Me: fine, go read the entire Linux kernel. Which codebases are worth having AI seriously chew through? Recommendations welcome 👇
#BuildInPublic# I am open-sourcing an AI simulation engine: SGO (Semantic Gradient Optimization)

You build something. You think it's ready. But you have no idea how actual people will react, and that reaction sequence is your product's real roadmap. User research takes weeks and still misses scenarios. You can ask an LLM to role-play a buyer persona, sure, but you get back one data point shaped entirely by whatever role you made up beforehand.

SGO takes a different route: simulate against census-aligned synthetic populations. NVIDIA open-sourced Nemotron-Personas-USA, a dataset of one million synthetic Americans built on top of US Census distributions. These aren't the "25-year-old tech worker" archetypes an LLM invents on the fly. They're construction workers in suburban Illinois, artisans in rural Texas, single parents in New York, each with hobbies, habits, and priorities that reflect real demographic distributions in age, education, occupation, income.

Paste in whatever you want to optimize: a product landing page, a fundraising pitch, a blog post draft. One user ran his dating profile through it. SGO picks an optimization target and audience for you, then samples from the million-person pool, stratifies by segment, runs each persona through a counterfactual evaluation, and stacks up a ranked list of what to change first and why.

About 30 seconds per run. Around $0.10 in API costs.

Code is open-source. Live demo on HuggingFace Spaces. Also works as a standalone Skill you can drop into an inner loop with auto-research.

HF Space:

Things people have already run through it: resumes, business plans, app UX flows, billboard copy, logos, landing page layouts, dating profiles, and one dessert shop's name.
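For readers unfamiliar with the "stratifies by segment" step, here is a minimal sketch of proportional stratified sampling. It is not SGO's code: the persona records, the `segment` field, and the function name are all hypothetical, and the real dataset schema may look nothing like this.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(personas, key, n_total, seed=0):
    """Draw roughly n_total personas, giving each stratum a share of the
    draws proportional to its share of the pool (at least one per stratum)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for persona in personas:
        strata[persona[key]].append(persona)

    sample = []
    for label in sorted(strata):
        members = strata[label]
        quota = max(1, round(n_total * len(members) / len(personas)))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample


# Hypothetical toy pool standing in for a large persona dataset.
SEGMENT_SIZES = {"construction worker": 500, "artisan": 300, "single parent": 200}
POOL = [
    {"id": f"{segment}-{i}", "segment": segment}
    for segment, size in SEGMENT_SIZES.items()
    for i in range(size)
]

if __name__ == "__main__":
    picked = stratified_sample(POOL, key="segment", n_total=20)
    print(Counter(p["segment"] for p in picked))
```

Sampling per stratum, instead of drawing 20 personas at random from the whole pool, guarantees that small segments still show up in every run, which matters when the goal is to hear from each group rather than only the largest one.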
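#BuildInPublic# I've open-sourced an AI simulation engine: SGO (Semantic Gradient Optimization engine)

When you iterate on products or features in the AI world, the thing most sorely missing right now is feedback from real users in the real world; the sequence of that feedback effectively forms the product's evolution path.

But feedback from real (human) users has a long cycle and cannot cover every scenario. Today we often have an LLM "pretend" to be a certain type of user to get approximate feedback, but that feedback arrives as isolated data points that depend entirely on roles planned in advance.

SGO's approach: simulate real users with synthetic data aligned to the census. NVIDIA has open-sourced several sovereign datasets; for the United States, the Nemotron-Personas-USA dataset contains one million synthetic personas generated from US Census data. These are not the "engineer with thirty years of experience" that an LLM makes up on the spot, but people with complete backgrounds: a construction worker in suburban Illinois, an artisan in rural Texas, a single mother in New York, and so on. Each has their own hobbies, habits, and concerns. Their age, education, occupation, and income distributions match the real population.

SGO's sampling, simulation, and gradient-computation framework lets you get feedback directly from these people, in about 30 seconds per cycle and for about $0.10 in LLM API costs.

Usage is simple: paste in the thing you want to optimize, such as a product description, a fundraising pitch, or a would-be viral article (one user even put his dating profile in to optimize it). Anything works, really.

SGO automatically picks the optimization target and target audience for you in a principled way. Once those are set, it samples from the one-million-persona population (stratified sampling), segments and clusters the sample, asks each persona for feedback (counterfactual inquiry), compares the answers against the target, and builds up the so-called "semantic gradient" piece by piece (roughly the Jacobian matrix of the target with respect to each variable), along with the final aggregated feedback and iteration direction.

The code is open source, and it is currently deployed on HuggingFace Spaces where you can try it directly.

You can use SGO on its own as a Skill, or put it in an inner loop and use it together with auto-research.

HF Space:

I hope SGO combined with auto-research helps everyone optimize the many scenarios that span the digital and physical worlds.

PS: scenarios that already work end to end
* Resume optimization
* Business plans
* App UX design
* Billboard design
* Logos
* Web page layout and color
* Dating profiles
* The name of a dessert shop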
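One way to picture the "semantic gradient" idea is as a finite-difference table: for each audience segment and each candidate change, record how much the score moves relative to the unchanged baseline. The sketch below is only an illustration of that reading, not SGO's implementation. The scoring function is a hard-coded stand-in for an LLM evaluation, and every name, segment, and number in it is invented.

```python
from statistics import mean


def semantic_gradient(segments, changes, score):
    """Table of score deltas: grad[segment][change] is the score with the
    change applied minus the baseline score for that segment."""
    grad = {}
    for segment in segments:
        baseline = score(segment, None)
        grad[segment] = {change: score(segment, change) - baseline for change in changes}
    return grad


def rank_changes(grad, changes):
    """Order candidate changes by their average delta across segments."""
    average = {change: mean(row[change] for row in grad.values()) for change in changes}
    return sorted(changes, key=lambda change: average[change], reverse=True)


# Invented scores standing in for per-persona LLM evaluations (None = baseline).
TOY_SCORES = {
    ("student", None): 3.0,
    ("student", "lower price"): 4.0,
    ("student", "add demo video"): 3.5,
    ("manager", None): 4.0,
    ("manager", "lower price"): 4.0,
    ("manager", "add demo video"): 5.0,
}


def toy_score(segment, change):
    return TOY_SCORES[(segment, change)]


if __name__ == "__main__":
    segments = ["student", "manager"]
    changes = ["lower price", "add demo video"]
    grad = semantic_gradient(segments, changes, toy_score)
    print(grad)
    print(rank_changes(grad, changes))
```

Keeping the full table, and not only the averaged ranking, is what makes the Jacobian analogy useful: a change that helps one segment and does nothing for another is visible row by row.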
The AI Scientist: Towards Fully Automated AI Research, Now Published in Nature

Nature:
Blog:

When we first introduced The AI Scientist, we shared an ambitious vision of an agent powered by foundation models capable of executing the entire machine learning research lifecycle. From inventing ideas and writing code to executing experiments and drafting the manuscript, the system demonstrated that end-to-end automation of the scientific process is possible.

Soon after, we shared a historic update: the improved AI Scientist-v2 produced the first fully AI-generated paper to pass a rigorous human peer-review process.

Today, we are happy to announce that “The AI Scientist: Towards Fully Automated AI Research,” our paper describing all of this work, along with fresh new insights, has been published in @Nature!

This Nature publication consolidates these milestones and details the underlying foundation model orchestration. It also introduces our Automated Reviewer, which matches human review judgments and actually exceeds standard inter-human agreement.

Crucially, by using this reviewer to grade papers generated by different foundation models, we discovered a clear scaling law of science. As the underlying foundation models improve, the quality of the generated scientific papers increases correspondingly. This implies that as compute costs decrease and model capabilities continue to exponentially increase, future versions of The AI Scientist will be substantially more capable.

Building upon our previous open-source releases, this open-access Nature publication comprehensively details our system's architecture, outlines several new scaling results, and discusses the promise and challenges of AI-generated science.

This substantial milestone is the result of a close and fruitful collaboration between researchers at Sakana AI, the University of British Columbia (UBC) and the Vector Institute, and the University of Oxford. Congrats to the team!
@_chris_lu_ @cong_ml @RobertTLange @_yutaroyamada @shengranhu @j_foerst @hardmaru @jeffclune
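Wow! Exo can be patched to have a CUDA backend, and now Apple Silicon and GB10 can be used together (pipeline parallel) for inference. Not sure about the network bandwidth, but the sheer human ingenuity is amazing. Honestly, Anthropic and OpenAI only have one or two years left of the 50% inference margin (and that's all because of the memory shortage for civilians). Token margin will collapse as TSMC pumps out all these chips that can be connected. --- With a CUDA backend added to exo, MLX and CUDA devices can run pipeline parallel (the model is split so that one node gets the first half and another node gets the second half). The biggest bottleneck for personal local inference is no longer compute but memory. Multi-device networked inference across Apple devices already works. If networked inference across heterogeneous devices also works given enough bandwidth, the cost of local inference drops sharply, opening up new use cases where open-source models run inference around the clock, even though the average output rate is lower than a data-center API.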
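A toy sketch of the pipeline-parallel idea described above: split the model's layers into contiguous stages, give each stage to one device, and pass the activation from stage to stage. This does not use exo's API or any real backend. The "layers" are plain Python functions, and the hand-off between stages stands in for the network transfer whose bandwidth is the open question.

```python
def split_into_stages(layers, n_stages):
    """Split a list of layers into n_stages contiguous, near-equal chunks,
    one chunk per device."""
    base, extra = divmod(len(layers), n_stages)
    stages, start = [], 0
    for index in range(n_stages):
        size = base + (1 if index < extra else 0)
        stages.append(layers[start:start + size])
        start += size
    return stages


def run_stage(stage, activation):
    """Run every layer of one stage in order (this would happen on one device)."""
    for layer in stage:
        activation = layer(activation)
    return activation


def pipeline_forward(stages, x):
    """Pass the activation through each stage in turn. In a real setup each
    hand-off between stages is a transfer over the network."""
    for stage in stages:
        x = run_stage(stage, x)
    return x


# Toy "model": four layers that are just arithmetic functions.
LAYERS = [
    lambda v: v + 1,
    lambda v: v * 2,
    lambda v: v - 3,
    lambda v: v * v,
]

if __name__ == "__main__":
    stages = split_into_stages(LAYERS, n_stages=2)
    print([len(stage) for stage in stages])
    print(pipeline_forward(stages, 5))
```

Only the activation crosses the boundary between the two halves, never the weights, which is why each device needs to hold just its own share of the model in memory.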