Summary

Anthropic announced MCP Tool Search for Claude Code, a lazy-loading mechanism that loads MCP tool definitions into context only when needed rather than preloading all of them upfront. The trigger threshold is 10% of the context budget: if MCP tool descriptions would exceed that, Tool Search activates and tools are retrieved via search instead. This directly addresses the top GitHub feature request (issue #7336), where users documented setups with 7+ MCP servers consuming 67K+ tokens.
Key Points

  • Trigger: activates when MCP tool descriptions would use >10% of context (otherwise tools preload as before)
  • Implementation: tools loaded via search on demand, not preloaded into every request
  • For MCP server authors: “server instructions” field becomes more important — it tells Claude when to search for your tools (analogous to skill descriptions)
  • For MCP client authors: implement ToolSearchTool with a custom search function (docs at platform.claude.com)
  • Resolves: users with 7+ MCP servers consuming 67K+ tokens of fixed overhead per request
  • Programmatic tool composition (MCP tools calling each other via code) was explored but deferred — Tool Search was the higher priority
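The trigger rule in the first bullet can be sketched as a simple budget check. This is a hypothetical illustration of the described behavior, not Anthropic's implementation; the function name, the 200K context figure, and the token counts are assumptions for the example:

```python
def should_use_tool_search(tool_def_tokens: int,
                           context_budget: int,
                           threshold: float = 0.10) -> bool:
    # Hypothetical sketch: activate dynamic Tool Search when preloaded
    # MCP tool definitions would exceed `threshold` of the context budget;
    # otherwise preload tool definitions as before.
    return tool_def_tokens > threshold * context_budget

# Using the figures from the source against an assumed 200K-token context:
assert should_use_tool_search(67_000, 200_000)       # 33.5% > 10% -> search mode
assert not should_use_tool_search(15_000, 200_000)   # 7.5% -> preload as before
```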

Insights

  • This is the architectural answer to the problem documented in “你不知道的 Claude Code”: MCP Server tool definitions were the largest hidden context cost (~4-6K tokens per server), and users were told to limit servers as a workaround — Tool Search removes that constraint
  • The 10% threshold is a principled cutoff: at that level, the fixed cost of preloading is large enough to justify the extra indirection of dynamic search
  • The “server instructions” field becoming more important parallels the skill description optimization insight: both are the “when should I use this?” signal that the model scans before deciding to load full content
  • Deferring programmatic tool composition is interesting: it suggests Anthropic views natural language tool orchestration (model decides which tools to call) as more robust than code-level composition for now
  • This feature makes the “install as many MCP servers as you want” use case viable — previously, server count was directly capped by context budget
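The "when should I use this?" mechanic described above could look something like the following on the client side. This is a minimal sketch with naive keyword scoring; the type names and ranking are hypothetical, and the actual ToolSearchTool interface is documented at platform.claude.com:

```python
from dataclasses import dataclass

@dataclass
class ToolDef:
    name: str
    description: str
    server_instructions: str  # the "when to use this" signal discussed above

def search_tools(query: str, tools: list[ToolDef], top_k: int = 3) -> list[ToolDef]:
    # Naive keyword overlap over names, descriptions, and server instructions;
    # a real search function could use BM25 or embeddings instead. Only
    # matching tools are returned, so non-matching definitions never enter context.
    terms = query.lower().split()

    def score(t: ToolDef) -> int:
        haystack = f"{t.name} {t.description} {t.server_instructions}".lower()
        return sum(haystack.count(term) for term in terms)

    ranked = sorted(tools, key=score, reverse=True)
    return [t for t in ranked if score(t) > 0][:top_k]
```

Note that only the top-k matching definitions are loaded, which is what breaks the link between installed server count and fixed context overhead.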

Connections

Raw Excerpt

Users were documenting setups with 7+ servers consuming 67k+ tokens. Tool Search allows Claude Code to dynamically load tools into context when MCP tools would otherwise take up a lot of context.