Effective Context Engineering for AI Agents

This note was generated by AI analysis.

Created: 2026-03-28 Source: https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents

Summary

Anthropic Engineering frames “context engineering” as the natural successor to prompt engineering: the focus shifts from crafting the wording of a prompt to managing the entire token state available to an LLM during inference. The article covers context rot, attention budgets, system prompt altitude, tool design, just-in-time context retrieval, and message history pruning strategies.

Key Points

Context engineering: optimizing the full token state (system prompt, tools, examples, message history, retrieved data) for the desired agent behavior; broader in scope than prompt engineering
Context rot: as context grows, LLM recall and reasoning quality degrade; treat context as a finite resource with diminishing marginal returns
Attention budget: in a transformer every token attends to every other token, giving n² pairwise relationships for n tokens; each added token competes for the same attention, so models are “stretched thin” at long contexts
System prompt altitude: aim for a Goldilocks zone, neither so specific that the prompt becomes brittle nor so vague that it gives the model no signal; organize the prompt into sections with XML tags or Markdown headers
Tools: minimal viable set; tools should be self-contained and have non-overlapping functionality; if a human can’t decide which tool to use, neither can an LLM
Just-in-time context: agents maintain lightweight identifiers (file paths, query templates) and dynamically load data at runtime rather than front-loading everything
Message history: summarize completed subtask steps rather than retaining raw tool call/response chains; strip boilerplate; prune early successful steps
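The message history point above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the article's implementation: the `Message` class, the `subtask` tag, and `compact_history` are hypothetical names, and the summaries are passed in by the caller (in a real agent they would typically be produced by a model call).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Message:
    role: str                      # e.g. "user", "assistant", or "tool"
    content: str
    subtask: Optional[str] = None  # hypothetical tag tying a message to a subtask


def compact_history(
    messages: List[Message],
    completed: Set[str],
    summaries: Dict[str, str],
) -> List[Message]:
    """Replace the raw messages of each completed subtask with one summary.

    Messages without a subtask tag, or belonging to a subtask that is still
    in progress, are kept unchanged and in their original order.
    """
    compacted: List[Message] = []
    emitted: Set[str] = set()
    for msg in messages:
        if msg.subtask is not None and msg.subtask in completed:
            # Emit the summary once, where the subtask first appeared,
            # and drop the remaining raw tool call/response messages.
            if msg.subtask not in emitted:
                text = f"[summary of {msg.subtask}] {summaries[msg.subtask]}"
                compacted.append(Message("assistant", text, msg.subtask))
                emitted.add(msg.subtask)
            continue
        compacted.append(msg)
    return compacted


history = [
    Message("user", "Fix the failing test"),
    Message("assistant", "Searching for the parser", "locate-bug"),
    Message("tool", "(long raw search output)", "locate-bug"),
    Message("assistant", "Editing the file", "apply-fix"),
]
shorter = compact_history(
    history,
    completed={"locate-bug"},
    summaries={"locate-bug": "The bug is in the parser."},
)
for m in shorter:
    print(m.role, "|", m.content)
```

Keeping the summary at the position of the first raw message preserves the order of events, which is one possible design choice; dropping completed subtasks entirely would save more tokens at the cost of losing the record of what was done.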

Insights

The “context rot” concept is the key empirical finding this article builds on. The mechanism (attention spread thin across n² pairwise relationships) gives a principled explanation for why LLM performance degrades with context length, and why the degradation is a gradient rather than a hard cliff. The “just-in-time context” framing generalizes Claude Code’s approach (targeted queries rather than loading entire codebases) to agentic systems broadly: agents should act like humans navigating a file system, not like LLMs trying to memorize a library. The “smallest possible set of high-signal tokens” principle is the operational formula for context engineering.
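The just-in-time idea might look something like the sketch below: the agent's prompt carries only a cheap manifest of identifiers, and file contents are read at the moment a step needs them. The class name `JustInTimeContext`, its methods, and the `max_chars` cap are assumptions made for illustration; they do not come from the article or from any particular library.

```python
import tempfile
from pathlib import Path
from typing import Dict


class JustInTimeContext:
    """Holds lightweight references and loads content only on request."""

    def __init__(self) -> None:
        self._refs: Dict[str, Path] = {}  # identifier -> file path

    def register(self, name: str, path) -> None:
        """Record where the data lives without reading it."""
        self._refs[name] = Path(path)

    def manifest(self) -> str:
        """Cheap listing of identifiers, suitable for placing in a prompt."""
        return "\n".join(
            f"{name}: {path.name}" for name, path in sorted(self._refs.items())
        )

    def load(self, name: str, max_chars: int = 2000) -> str:
        """Read at most max_chars of the referenced file when it is needed."""
        return self._refs[name].read_text(encoding="utf-8")[:max_chars]


# Usage: only the manifest is always in context; content is pulled on demand.
with tempfile.TemporaryDirectory() as tmp:
    notes = Path(tmp) / "notes.txt"
    notes.write_text("first line\nsecond line\n", encoding="utf-8")

    ctx = JustInTimeContext()
    ctx.register("notes", notes)
    print(ctx.manifest())
    print(ctx.load("notes", max_chars=10))
```

The character cap is a crude stand-in for a token budget; a real agent would more likely use targeted queries (search, head/tail, line ranges) to pull only the relevant slice.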

Connections

Raw Excerpt

Good context engineering means finding the smallest possible set of high-signal tokens that maximize the likelihood of some desired outcome. Context, therefore, must be treated as a finite resource with diminishing marginal returns.
