cv usk
@cv_usk
AI / Software Research Notes AI Agent, LLMOps, MLOps, Software Architecture
🌳 AI is finally evolving from a tool that runs one-off experiments into something closer to a researcher itself, one that compounds knowledge across time. A new autonomous research framework grows hypotheses as a single living tree.

Title: Toward Generalist Autonomous Research via Hypothesis-Tree Refinement
URL:

🔍 Overview
This work proposes Arbor, a framework for long-term autonomous research. Its core idea, Hypothesis-Tree Refinement, links hypotheses, the artifacts produced by experiments, the evidence gathered, and the insights distilled from them into a single persistent tree. With every experiment the tree is updated, continuously sharpening the search frontier that decides which direction to explore next.

❓ Challenges Solved
Previous LLM research agents could barely manage single, isolated experiments.
・They could not maintain a big-picture strategy about which hypothesis to pursue across multiple attempts
・Lessons learned in one experiment were rarely carried into the next, so exploration started from scratch each time
・They lacked a mechanism to tell promising branches from dead ends and allocate limited compute accordingly
In short, knowledge never compounded, and that was the real bottleneck.

💡 Methodology & Proposed Approach
Arbor is built from two kinds of agents and a persistent tree that connects them.
・A long-lived coordinator acts as the strategist, surveying the hypothesis tree and choosing what to test next. Because it persists across sessions, it preserves long-term coherence
・Short-lived executors implement and test individual hypotheses in isolated environments, then retire once their job is done
・The hypothesis tree links hypotheses, evidence, artifacts, and insights over time, propagating reusable lessons across the whole effort
This turns research from a bag of isolated experiments into a cumulative process where strategy, execution, and evidence build on each other.
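
The coordinator, executor, and tree roles above can be sketched roughly as follows. This is a toy illustration and not Arbor's actual code: the names `Node`, `Result`, `Coordinator`, `toy_executor`, and `TOY_SCORES`, the single scalar score, and the "deepen the best-scoring branch first" rule are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional, Tuple


@dataclass
class Result:
    """What one experiment leaves behind (hypothetical fields)."""
    score: float    # evidence, reduced to a single number for this sketch
    artifact: str   # e.g. a path to code or a checkpoint
    insight: str    # a reusable lesson distilled from the run


@dataclass
class Node:
    """One hypothesis in the persistent tree."""
    statement: str
    parent: Optional[Node] = None
    children: List[Node] = field(default_factory=list)
    result: Optional[Result] = None

    def add_child(self, statement: str) -> Node:
        child = Node(statement, parent=self)
        self.children.append(child)
        return child

    def inherited_insights(self) -> List[str]:
        """Insights recorded on tested ancestors, root first."""
        found: List[str] = []
        node = self.parent
        while node is not None:
            if node.result is not None:
                found.append(node.result.insight)
            node = node.parent
        return list(reversed(found))

    def walk(self) -> Iterator[Node]:
        yield self
        for child in self.children:
            yield from child.walk()


Executor = Callable[[str, List[str]], Result]


class Coordinator:
    """Long-lived strategist: surveys the tree and picks what to test next."""

    def __init__(self, root: Node) -> None:
        self.root = root

    def frontier(self) -> List[Node]:
        # Untested hypotheses whose parent (if any) has already been tested.
        return [n for n in self.root.walk()
                if n.result is None
                and (n.parent is None or n.parent.result is not None)]

    def select(self) -> Optional[Node]:
        candidates = self.frontier()
        if not candidates:
            return None
        # Assumed policy: refine the branch whose parent scored best.
        return max(candidates,
                   key=lambda n: n.parent.result.score if n.parent else 0.0)

    def step(self, executor: Executor) -> Optional[Node]:
        node = self.select()
        if node is None:
            return None
        # A short-lived executor tests one hypothesis, given inherited lessons.
        node.result = executor(node.statement, node.inherited_insights())
        return node


# Made-up scores standing in for real experiments.
TOY_SCORES = {
    "baseline": 0.5,
    "add augmentation": 0.7,
    "larger learning rate": 0.3,
    "stronger augmentation": 0.8,
    "even larger learning rate": 0.2,
}


def toy_executor(statement: str, insights: List[str]) -> Result:
    score = TOY_SCORES[statement]
    return Result(score, f"run:{statement}", f"{statement} scored {score:.2f}")


def demo() -> Tuple[List[str], Node]:
    root = Node("baseline")
    a = root.add_child("add augmentation")
    b = root.add_child("larger learning rate")
    a.add_child("stronger augmentation")
    b.add_child("even larger learning rate")
    coordinator = Coordinator(root)
    order: List[str] = []
    while (node := coordinator.step(toy_executor)) is not None:
        order.append(node.statement)
    return order, root


if __name__ == "__main__":
    print(demo()[0])
```

In this toy run the coordinator tests the baseline, then the augmentation branch and its refinement, and only afterwards returns to the weaker learning-rate branch. Each executor call also receives the insights recorded on its ancestors, which is the "lessons carry forward" idea in miniature.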
🎯 Use Cases
Promising applications include AutoML pipelines that improve through continuous experimentation, as well as automating the scientific discovery process itself. The appeal is running long stretches of trial and error strategically, without a human in the loop.

📊 Experimental Results
Arbor was evaluated under an Autonomous Optimization setting across six real research tasks.
・It achieved the best held-out results on all six tasks
・Its average held-out gain was more than 2.5x that of Codex and Claude Code
・On MLE-Bench Lite it reached 86.36% Any Medal when paired with GPT-5.5, the strongest result among the systems compared

#AIAgents #AutonomousResearch