This note was generated by AI analysis
Created: 2026-03-22 Source: https://x.com/vasuman/status/2011923440769659132
Summary
A production-oriented guide to building AI agents from someone who spent 3 years at Meta building systems that processed billions of transactions. The core distinction: agents act on goals (not instructions), and handle exceptions by routing to humans rather than failing silently. Every production agent requires three components — perception, decision logic, and action interface — and the most common failure is skipping the planning step before execution.
Key Points
- Agent vs automation: automation follows instructions (“send this email”), while agents pursue goals (“ensure customer gets a response within 4 hours”); the key difference is exception handling
- The agent loop: observe state → decide action → take action → observe result → repeat until goal met or stopping condition hit
- 3 required components: Perception (APIs, DBs, document stores), Decision Logic (structured trees for routine + model for ambiguous), Action Interface (logged, reversible where possible, gated by permissions)
- Tools: model doesn’t execute tools — it returns a structured request; orchestration layer validates, executes, captures result, feeds back
- Good tool design: one tool = one thing, clear success/failure states, structured return data
- Memory pattern: context window for active work, external memory (files/DBs) for history; every decision logged for audit trail
- The planning step is the most commonly skipped and the most important: it breaks the goal into steps, identifies dependencies, and surfaces edge cases before execution
- Failure handling: retry with backoff for transient errors, human-in-the-loop for uncertain cases, and safe failures (never delete old data)
- Guardrails are hard limits the agent can’t bypass; permissions are RBAC enforced by orchestration layer (agent doesn’t know about them)
- Start narrow: one agent, one thing, reliably — then expand scope
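The agent loop and the tool-dispatch points above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the original author's implementation: `ToolRequest`, `ToolResult`, `send_reply`, `TOOLS`, `decide`, and `run_agent` are hypothetical names, the ticket state is invented, and a plain rule-based function stands in for the model.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class ToolRequest:
    """Structured request returned by the decision step; nothing runs yet."""
    tool: str
    args: dict[str, Any]


@dataclass
class ToolResult:
    """Clear success/failure state plus structured return data."""
    ok: bool
    data: dict[str, Any] = field(default_factory=dict)
    error: str = ""


def send_reply(state: dict[str, Any], ticket_id: str) -> ToolResult:
    # Hypothetical tool that does one thing: mark a ticket as answered.
    if ticket_id not in state["tickets"]:
        return ToolResult(ok=False, error=f"unknown ticket {ticket_id}")
    state["tickets"][ticket_id] = "answered"
    return ToolResult(ok=True, data={"ticket_id": ticket_id})


# Registry owned by the orchestration layer (assumed for this sketch).
TOOLS: dict[str, Callable[..., ToolResult]] = {"send_reply": send_reply}


def decide(state: dict[str, Any]) -> Optional[ToolRequest]:
    # Stand-in for the model: request a reply for the first open ticket,
    # or return None once the goal (no open tickets) is met.
    for ticket_id, status in state["tickets"].items():
        if status == "open":
            return ToolRequest("send_reply", {"ticket_id": ticket_id})
    return None


def run_agent(state: dict[str, Any], max_steps: int = 10) -> list[dict[str, Any]]:
    log: list[dict[str, Any]] = []
    for step in range(max_steps):          # stopping condition: step budget
        request = decide(state)            # observe state, decide action
        if request is None:                # goal met
            break
        tool = TOOLS.get(request.tool)     # orchestration layer validates
        if tool is None:
            result = ToolResult(ok=False, error=f"unknown tool {request.tool}")
        else:
            result = tool(state, **request.args)  # orchestration layer executes
        # Every decision and its outcome is logged for the audit trail.
        log.append({"step": step, "tool": request.tool, "args": request.args,
                    "ok": result.ok, "error": result.error})
        if not result.ok:                  # escalate instead of failing silently
            log.append({"step": step, "escalated": True})
            break
    return log


state = {"tickets": {"T1": "open", "T2": "answered", "T3": "open"}}
audit_log = run_agent(state)
```

In this sketch the decision step only ever returns data; the registry lookup and the call itself happen in `run_agent`, which is where validation and logging would live in a real orchestration layer.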
Insights
- “Agents handle exceptions” is the definitional boundary between automation and agency — the value proposition is not speed but resilience; an agent that routes correctly on failure is more valuable than one that succeeds 90% of the time and silently corrupts 10%
- The orchestration layer as the enforcement point (not the model) is architecturally important: permissions and guardrails live outside the model, meaning they can’t be prompted around — this is the correct security model
- External memory as audit trail reframes memory from a capability concern to a debugging/governance tool — knowing what the agent decided and why at each step is what makes production systems maintainable
- The 80/20 deployment pattern (agents handle 80% routine, route 20% complex to humans) echoes the AI moats article’s 80/20 rule — in both cases, the insight is that human judgment should be reserved for genuinely ambiguous situations
- “You’ll learn more from one real implementation than from reading ten more articles” — this is meta-applicable to this vault: the clippings are inputs, but building something is the only way to internalize the patterns
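To make the enforcement-point insight concrete, here is a small sketch of permissions and guardrails living outside the model. All names (`ToolRequest`, `ROLE_PERMISSIONS`, `MAX_REFUND`, `authorize`, `handle`) and the refund scenario are assumptions made for illustration; the source does not specify any of them.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToolRequest:
    """What the model asks for; it carries no role or permission data."""
    tool: str
    amount: float = 0.0


# Hypothetical policy tables owned by the orchestration layer.
# They are never placed in the prompt, so they cannot be prompted around.
ROLE_PERMISSIONS: dict[str, set[str]] = {
    "support_agent": {"send_reply", "issue_refund"},
    "readonly_agent": {"lookup_order"},
}
MAX_REFUND = 100.0  # hard guardrail (assumed value)


def authorize(role: str, request: ToolRequest) -> tuple[bool, str]:
    # RBAC check: is this tool allowed for the role at all?
    if request.tool not in ROLE_PERMISSIONS.get(role, set()):
        return False, "permission_denied"
    # Guardrail check: a hard limit that holds even for permitted tools.
    if request.tool == "issue_refund" and request.amount > MAX_REFUND:
        return False, "guardrail_exceeded"
    return True, "allowed"


def handle(role: str, request: ToolRequest, audit_log: list[dict[str, Any]]) -> str:
    allowed, reason = authorize(role, request)
    # Log what was requested and why it was allowed or blocked.
    audit_log.append({"role": role, "tool": request.tool,
                      "amount": request.amount, "allowed": allowed,
                      "reason": reason})
    return "executed" if allowed else "routed_to_human"


audit_log: list[dict[str, Any]] = []
results = [
    handle("support_agent", ToolRequest("issue_refund", 40.0), audit_log),
    handle("support_agent", ToolRequest("issue_refund", 500.0), audit_log),
    handle("readonly_agent", ToolRequest("issue_refund", 40.0), audit_log),
]
```

The design choice being illustrated is that `authorize` runs on every request regardless of what the model produced, and the audit log records the reason for each decision.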
Connections
- AI Agents
- Automation
- Claude Code
- Orchestration
- Planning
- Everyone using AI has about 12 months to develop these 3 moats
- 你不知道的 Claude Code:架构、治理与工程实践
Raw Excerpt
agents handle the 80% of cases that are straightforward, and they route the 20% that are actually complex to humans who can apply judgment. The goal is to free the expertise of humans to focus on the problems that actually require it.
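The routing described in this excerpt, combined with the failure handling listed under Key Points, might look something like the sketch below. It is only an illustration: `TransientError`, `call_with_backoff`, `handle_case`, the confidence score, and the 0.8 threshold are all hypothetical, and how a real system estimates confidence is not covered by the source.

```python
import time
from typing import Any, Callable


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


def call_with_backoff(action: Callable[[], Any], retries: int = 3,
                      base_delay: float = 0.1,
                      sleep: Callable[[float], None] = time.sleep) -> Any:
    # Retry transient failures, doubling the wait after each attempt.
    for attempt in range(retries):
        try:
            return action()
        except TransientError:
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def handle_case(action: Callable[[], Any], confidence: float,
                threshold: float = 0.8,
                sleep: Callable[[float], None] = time.sleep) -> str:
    # Uncertain cases go to a human before anything is attempted.
    if confidence < threshold:
        return "routed_to_human"
    try:
        call_with_backoff(action, sleep=sleep)
        return "handled_by_agent"
    except TransientError:
        # Retries exhausted: escalate rather than fail silently.
        return "routed_to_human"


# Usage: an action that fails twice with a transient error, then succeeds.
delays: list[float] = []
attempts = {"n": 0}


def flaky() -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientError("timeout")
    return "ok"


outcome = handle_case(flaky, confidence=0.95, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the example fast to run and makes the backoff schedule observable; in production code the default `time.sleep` would apply.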