cv usk
@cv_usk
AI / Software Research Notes AI Agent, LLMOps, MLOps, Software Architecture
Joined May 2026
240 Following    206 Followers
Coding agents fire dozens of API calls per task, so a single developer can quietly burn thousands of dollars a week before anyone notices. Here's how LangChain killed that "spend unpredictability" internally 💸 The key was folding budget control into the same place as observability.

Title: How LangChain Made Coding Agent Spend Predictable
URL:

💸 Overview
An LLM Gateway built into LangSmith gives a company-wide, minute-level view of model spend and manages budgets centrally. Rather than a bolt-on proxy, it sits on the same foundation as existing tracing, evaluation, and user management.

❓ Challenges Solved
Model usage spread from a few teams to the whole company, and premium model prices rose, so costs spiked.
・Coding agents trigger dozens of API calls per task
・Individual developers ran up thousands of dollars a week, unnoticed until month-end

💡 Methodology & Proposed Approach
Budgets can be set across multiple layers.
・Caps at the organization, workspace, user, and API-key level
・Default monthly, weekly, daily, and hourly windows for all employees, with exceptions for heavy projects
・Covers agents accessed via Claude Code, Codex, and LangChain Deep Agents
・Deployed via MDM so no one has to set it up manually
・Runs are traced and tied to a user and API key; overspend can be diagnosed by inspecting the trace with evaluation data

🌍 Use Cases
Engineering leaders can set team-level limits while still letting people use agents without fear of a surprise bill. The practical value is replacing the month-end billing shock with real-time monitoring.
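
🧪 Sketch: layered budget check
To make the multi-layer caps concrete, here is a minimal sketch of how spend might be checked against several scopes and time windows at once. This is not LangSmith's API: the `Budget` class, the `check_spend` function, the 80% `warn_ratio`, and the dollar figures are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    """One cap on one layer. All names and defaults here are hypothetical."""
    scope: str                # "organization", "workspace", "user", or "api_key"
    window: str               # "monthly", "weekly", "daily", or "hourly"
    limit_usd: float
    warn_ratio: float = 0.8   # assumed early-warning threshold


def check_spend(budgets, spent_usd):
    """Return the strictest status across all layers: ok < warn < blocked.

    spent_usd maps (scope, window) to dollars spent so far in that window.
    """
    severity = {"ok": 0, "warn": 1, "blocked": 2}
    worst = "ok"
    for b in budgets:
        spent = spent_usd.get((b.scope, b.window), 0.0)
        if spent >= b.limit_usd:
            status = "blocked"
        elif spent >= b.limit_usd * b.warn_ratio:
            status = "warn"
        else:
            status = "ok"
        if severity[status] > severity[worst]:
            worst = status
    return worst


budgets = [
    Budget("organization", "monthly", 10_000.0),
    Budget("user", "daily", 50.0),
]
spent = {("organization", "monthly"): 2_000.0, ("user", "daily"): 45.0}
status = check_spend(budgets, spent)  # the user's daily cap is past its warning threshold
```

Returning the strictest status means a tight per-user or per-key cap still triggers even when the organization as a whole is well under budget.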

📊 Lessons & Outcomes
・Static price tables go stale fast, so pricing must be handled dynamically, including caching and tier differences
・Cursor and Claude Desktop didn't route cleanly, so they measured the delta between Gateway-captured traffic and provider settings to correct for it
・Hard limits alone block real work, so they evolved into early-warning alerts and auditable budget-increase requests
・Since internal rollout, LLM costs have stayed within budget

#CodingAgents #LLMOps
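
🧪 Sketch: estimating spend that bypasses the Gateway
The delta correction for tools that don't route cleanly can be sketched as a simple subtraction. This assumes the provider side exposes a total spend figure for the same period, which the post does not spell out; the function names and dollar amounts below are hypothetical.

```python
def untracked_spend(provider_total_usd, gateway_captured_usd):
    """Spend the provider reports that never passed through the gateway.

    Clamped at zero so timing differences between the two sources
    cannot produce a negative estimate.
    """
    return max(provider_total_usd - gateway_captured_usd, 0.0)


def gateway_coverage(provider_total_usd, gateway_captured_usd):
    """Fraction of provider-reported spend that the gateway actually saw."""
    if provider_total_usd <= 0:
        return 1.0  # nothing was billed, so nothing was missed
    return min(gateway_captured_usd / provider_total_usd, 1.0)


# Hypothetical figures for one billing period.
missing = untracked_spend(1200.0, 950.0)
coverage = gateway_coverage(1200.0, 950.0)
```

A coverage value that drifts downward over time would suggest more traffic is going around the Gateway and the per-user numbers are understating real spend.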