Search results for HarnessEngineering
Tweets including HarnessEngineering
Harness Engineering Practices P6. Multi-Layer Verification Pyramid (Match Verification Frequency to Loop Position)

🎯 Point
Run a 10-minute E2E test every step and your agent's thinking stops every 10 minutes. Verification needs a "weight class" design.

📝 Overview
Fast, cheap checks (type checking, lint) run every step. Medium checks (unit tests) run at milestones. Slow, expensive checks (integration, E2E) run at completion gates. Match verification cadence to loop position to minimize the latency tax.

🔍 Explanation
Running all verification every step is inefficient; batching it all at completion is too risky. This practice applies test pyramid thinking to verification timing. Type checks and lint finish in seconds, so running them every step barely impacts latency. E2E tests take minutes and would severely hurt agent throughput if run frequently. By placing fast verification inside the loop and slow verification outside, you achieve both feedback speed and coverage.

🛠 How to Practice
- Classify verification into 3 tiers: every step (type checks, lint; seconds), milestones (unit tests; tens of seconds), completion gate (integration, E2E; minutes)
- Embed verification timing into the harness loop structure so each tier fires automatically at the right moment
- Measure latency per tier and narrow test scope or parallelize if inner-loop verification is too slow
- Periodically audit for coverage gaps between tiers and close them

💼 Use Cases
- Pair programming: run only type checks instantly on each edit, run only affected tests
- Issue-to-PR agents: verify with unit tests during implementation, run full CI before completion
- CI auto-maintenance: triage with cheap checks, run full tests only for repair verification

⚠ Pitfalls
Getting the layers wrong costs both speed and quality. Heavy checks run too often kill latency; only light checks let regressions escape. Overconfidence in lightweight checks ("type check passed, we're good") is also dangerous. Understand exactly what each layer covers and verify there are no gaps between layers.

#HarnessEngineering# #TestingStrategy#
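
One way the three tiers in the post above might be wired into an agent loop is sketched below. This is a minimal illustration, not the author's implementation: the names `Check`, `VerificationPyramid`, `tiers_for`, `run_loop`, and the `milestone_every` parameter are all hypothetical, and the lambdas stand in for real tools such as a type checker, a unit test runner, or an E2E suite. The sketch fires the cheap tier on every step, the medium tier at fixed milestones, and the expensive tier only at the final step, while accumulating time spent per tier so slow inner-loop checks are visible.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Check:
    name: str
    tier: str                 # "step", "milestone", or "gate" (hypothetical labels)
    run: Callable[[], bool]   # returns True when the check passes


@dataclass
class VerificationPyramid:
    checks: List[Check]
    latency: Dict[str, float] = field(default_factory=dict)  # seconds spent per tier

    def verify(self, tier: str) -> bool:
        """Run the checks registered for one tier and record the time spent."""
        start = time.perf_counter()
        # all() stops at the first failing check in the tier.
        ok = all(check.run() for check in self.checks if check.tier == tier)
        elapsed = time.perf_counter() - start
        self.latency[tier] = self.latency.get(tier, 0.0) + elapsed
        return ok


def tiers_for(step: int, total_steps: int, milestone_every: int) -> List[str]:
    """Decide which tiers fire at a given loop position (cheapest first)."""
    tiers = ["step"]
    if step % milestone_every == 0:
        tiers.append("milestone")
    if step == total_steps:
        tiers.append("gate")
    return tiers


def run_loop(pyramid: VerificationPyramid, total_steps: int,
             milestone_every: int = 3) -> List[Tuple[int, str, bool]]:
    """Simulate an agent loop; the agent's actual edit would happen each step."""
    log: List[Tuple[int, str, bool]] = []
    for step in range(1, total_steps + 1):
        for tier in tiers_for(step, total_steps, milestone_every):
            ok = pyramid.verify(tier)
            log.append((step, tier, ok))
            if not ok:
                return log  # stop early and hand the failure back to the agent
    return log


if __name__ == "__main__":
    # Placeholder checks; real ones would shell out to lint, tests, and so on.
    pyramid = VerificationPyramid([
        Check("lint", "step", lambda: True),
        Check("unit tests", "milestone", lambda: True),
        Check("e2e", "gate", lambda: True),
    ])
    for entry in run_loop(pyramid, total_steps=6):
        print(entry)
    print(pyramid.latency)
```

Because cheaper tiers run first at each position, a failed lint check prevents the more expensive tiers from running at all on that step. The `latency` dictionary is the hook for the post's third bullet: if the "step" total grows too large, that is the signal to narrow scope or parallelize.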
New blog post 📝 "Buzzword Engineering"

Prompt Engineering, Context Engineering, Harness Engineering, and now Loop Engineering: at least four "Engineerings" have been born in just a few years since LLMs arrived. Why does the next name keep arriving before the previous one has matured into a real methodology? 🤔

In this post, I name and dissect this phenomenon: Buzzword Engineering, a mode of knowledge production in which methodologies are named and shipped faster than verification can digest them.

The root cause is an asymmetry of speed. LLMs drove the cost of proposing methodologies to nearly zero, and are even becoming proposers themselves. Verification, however, completes only when products are used by real users; it remains rate-limited by human behavior. Proposals move at machine speed; verification moves at human speed. Names pile up in the gap as a backlog. ⚙️

But this is not a piece that sneers at buzzwords. As Schumpeter's "swarms" and the hype cycle show, proliferation is written into the standard timetable of every technological revolution; it is the first step of knowledge creation, coordinating the attention of engineers worldwide.

What I propose instead is a gearbox connecting two clocks: the weekly clock of methodology and the yearly clock of product value. That gearbox is xOps. Inside it: evaluation assets that compound over time, an "autonomy budget" for operating agent delegation by observation, and one norm: if you coin a name, attach falsification conditions and an eval. Methodologies depreciate; evaluation assets compound. 🚀

If you're tired of chasing new names, this one is for you.

#BuzzwordEngineering# #TechTrends#