Search results for LLMOps
Tweets including LLMOps
Coding agents fire dozens of API calls per task, so a single developer can quietly burn thousands of dollars a week before anyone notices. Here's how LangChain killed that "spend unpredictability" internally 💸 The key was folding budget control into the same place as observability.

Title: How LangChain Made Coding Agent Spend Predictable
URL:

💸 Overview
An LLM Gateway built into LangSmith gives a company-wide, minute-level view of model spend and manages budgets centrally. Rather than a bolt-on proxy, it sits on the same foundation as existing tracing, evaluation, and user management.

❓ Challenges Solved
Model usage spread from a few teams to the whole company, and premium model prices rose, so costs spiked.
・Coding agents trigger dozens of API calls per task
・Individual developers ran up thousands of dollars a week, unnoticed until month-end

💡 Methodology & Proposed Approach
Budgets can be set across multiple layers.
・Caps at the organization, workspace, user, and API-key level
・Default monthly, weekly, daily, and hourly windows for all employees, with exceptions for heavy projects
・Covers agents accessed via Claude Code, Codex, and LangChain Deep Agents
・Deployed via MDM so no one has to set it up manually
・Runs are traced and tied to a user and API key; overspend can be diagnosed by inspecting the trace with evaluation data

🌍 Use Cases
Engineering leaders can set team-level limits while still letting people use agents without fear of a surprise bill. The practical value is replacing the month-end billing shock with real-time monitoring.

📊 Lessons & Outcomes
・Static price tables go stale fast, so pricing must be handled dynamically, including caching and tier differences
・Cursor and Claude Desktop didn't route cleanly, so they measured the delta between Gateway-captured traffic and provider settings to correct for it
・Hard limits alone block real work, so they evolved into early-warning alerts and auditable budget-increase requests
・Since internal rollout, LLM costs have stayed within budget

#CodingAgents# #LLMOps#
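
To make the layered caps described above concrete, here is a minimal sketch of how a request might be checked against budgets at the organization, workspace, user, and API-key level. Everything in it (`Budget`, `first_exceeded`, `record_spend`, the scope names as strings) is hypothetical and chosen for illustration; it is not the LangSmith API, and it assumes a single budget window per scope.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical scope names, ordered from broadest to narrowest.
SCOPES = ("organization", "workspace", "user", "api_key")


@dataclass
class Budget:
    cap_usd: float          # spend limit for the current window
    spent_usd: float = 0.0  # spend already recorded in this window


def first_exceeded(budgets: Dict[str, Budget], cost_usd: float) -> Optional[str]:
    """Return the first scope whose cap this request would exceed, else None."""
    for scope in SCOPES:
        budget = budgets.get(scope)
        if budget is not None and budget.spent_usd + cost_usd > budget.cap_usd:
            return scope
    return None


def record_spend(budgets: Dict[str, Budget], cost_usd: float) -> None:
    """Add the cost of an allowed request to every configured scope."""
    for budget in budgets.values():
        budget.spent_usd += cost_usd


# Example: the org has plenty of headroom, but one user is near their cap.
budgets = {
    "organization": Budget(cap_usd=1000.0, spent_usd=100.0),
    "user": Budget(cap_usd=50.0, spent_usd=49.5),
}
blocked_by = first_exceeded(budgets, cost_usd=1.0)  # "user"
```

Checking every layer before recording spend means a narrow cap (one user) can stop a request even when the broader cap (the organization) still has room, which matches the idea of defaults for everyone with exceptions for heavy projects.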
Logging alone won't prevent incidents. Meet runtime governance that blocks problematic requests before they reach the LLM 🛡️

Title: LangSmith LLM Gateway: runtime governance built into the agent lifecycle
URL:

🛡️ Overview
A runtime governance layer that sits between agents and their model providers. As an enforcement point inside the LangSmith platform, it aims to stop problems at the source rather than just logging them after the fact.

❓ Challenges Solved
Observability (logging) alone can't prevent problems. Logging an incident after it happens is too late; problematic requests should be blocked before they reach external LLM providers.

💡 How It Works
・Spend controls: hard caps at org, workspace, user, or API-key level; returns 402 when exceeded
・Cost visibility: real-time spend tracking across org units
・Data protection: auto-redaction of PII and secrets before they reach the model
・Trace integration: gateway-proxied calls appear in the same workspace
・Audit logging and layered policies
Setup is minimal: point base_url at the Gateway, store provider keys in workspace secrets, and define policies in the LangSmith UI.

🌍 Use Cases
・Preventing runaway agent spend from retry loops
・Stopping sensitive data (SSNs, PII) from leaking into provider logs
・Establishing org-wide cost governance and compliance auditing

#LLMOps# #AIGovernance#
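
As a rough illustration of enforcing before the request leaves, rather than logging afterwards, the sketch below rejects a request with HTTP 402 when a cap is already reached and otherwise redacts US Social Security numbers from the prompt. The function names, the single flat cap, and the SSN-only pattern are assumptions made for this example; the actual Gateway's redaction rules and policy model are not described in the post.

```python
import re
from typing import Tuple

# Assumption: only the common NNN-NN-NNNN SSN format is matched here.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text: str) -> str:
    """Replace anything that looks like an SSN with a placeholder."""
    return SSN_RE.sub("[REDACTED_SSN]", text)


def preflight(prompt: str, spent_usd: float, cap_usd: float) -> Tuple[int, str]:
    """Hypothetical gateway check run before forwarding to a provider.

    Returns (status_code, body): 402 if the cap is already reached,
    otherwise 200 with the redacted prompt that would be forwarded.
    """
    if spent_usd >= cap_usd:
        return 402, "budget exceeded"
    return 200, redact(prompt)


status, body = preflight("My SSN is 123-45-6789.", spent_usd=1.0, cap_usd=5.0)
# status is 200 and body is "My SSN is [REDACTED_SSN]."
```

Running the spend check first means an over-budget request is never inspected or forwarded at all, so a retry loop that keeps hitting the cap gets a cheap 402 each time instead of another billable call.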