Rohan Paul
@rohanpaul_ai
Compiling in real-time, the race towards AGI. The Largest Show on X for AI. 🗞️ Get my daily AI analysis newsletter to your email 👉
Joined June 2014
7.4K Following    149.4K Followers
Better search may come less from smarter indexes than from giving agents a richer way to touch text.

This paper shows that AI agents using basic terminal tools (grep, file reads, shell commands) to search raw data perform far better than conventional retrieval systems on multiple benchmarks. On BrowseComp-Plus, swapping semantic retrieval for terminal search raised accuracy from 69% to 80% while lowering cost.

The deeper point is not that grep is magically smarter than embeddings. It is that retrieval is usually treated as a model problem, when it is also an interface problem. A conventional retriever turns the corpus into a narrow ritual: ask once, receive a ranked list, reason over whatever survived. That works when the question is close to a document’s semantic center, but it breaks when the answer depends on exact phrases, faint clues, document structure, or a chain of small discoveries.

Direct Corpus Interaction (DCI) changes the shape of the task. The agent can search an exact string, inspect nearby context, notice a new entity, constrain the search again, and keep testing its hypothesis against the raw files.

Here’s the part most people miss: the gain was not mainly from finding more gold documents, but from extracting more usable evidence once a promising document was reached. That makes DCI less like a better search engine and more like giving the model fingers.

The limitation is real: as the corpus grows, the cost of finding the first useful anchor rises quickly, and blunt terminal search will not replace indexes for every large, static collection.

But the paper’s lesson still lands cleanly. For capable agents, the bottleneck may no longer be only what they know, or even how they reason, but how much of the world their tools allow them to touch.

----

Paper Link: arxiv.org/abs/2605.05242
Paper Title: "Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction"
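
To make the search, inspect, refine loop concrete, here is a rough Python sketch. It is not the paper's implementation: the `grep_corpus` helper, the `Hit` type, the toy `CORPUS`, and the hard-coded "designed by" heuristic are all assumptions for illustration. In a real agent the model itself would decide what to search for next.

```python
import re
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    path: str
    line_no: int        # 1-based line number of the matching line
    context: List[str]  # the matching line plus its neighbours


def grep_corpus(corpus: Dict[str, str], phrase: str, context: int = 1) -> List[Hit]:
    """Case-insensitive exact-phrase search over raw text, roughly `grep -i -n -C`."""
    needle = re.compile(re.escape(phrase), re.IGNORECASE)
    hits: List[Hit] = []
    for path, text in corpus.items():
        lines = text.splitlines()
        for i, line in enumerate(lines):
            if needle.search(line):
                lo = max(0, i - context)
                hi = min(len(lines), i + context + 1)
                hits.append(Hit(path, i + 1, lines[lo:hi]))
    return hits


# Hypothetical toy corpus standing in for raw files on disk.
CORPUS = {
    "notes/a.txt": "The bridge opened in 1932.\nIt was designed by Ada Lindqvist.\nTolls ended in 1970.",
    "notes/b.txt": "Ada Lindqvist was born in Uppsala.\nShe studied civil engineering.",
    "notes/c.txt": "Uppsala is a city in Sweden.\nUnrelated text about bridges in general.",
}

# Step 1: search an exact string to find a first anchor.
first = grep_corpus(CORPUS, "bridge opened")

# Step 2: inspect nearby context and notice a new entity.
# (Assumption: a fixed string heuristic stands in for the model's judgment.)
designer_line = next(l for l in first[0].context if "designed by" in l)
entity = designer_line.split("designed by ")[1].rstrip(".")

# Step 3: constrain the next search using the entity just discovered.
second = grep_corpus(CORPUS, entity + " was born")

for hit in second:
    print(hit.path, hit.line_no, hit.context)
```

The second query could not have been written up front, because the designer's name only appears in the context around the first hit. That dependency between steps is what a single ranked-list retrieval call cannot express.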