本文由 AI 分析生成
Summary
Deep dive comparing OpenAI Codex CLI and Anthropic Claude Code architectures. Both use a single-agent ReAct loop (Think → Tool Call → Observe → Repeat). Key differentiator: Codex uses shell-first tools (cat, grep, find, apply_patch) with safety encoded in the system prompt; Claude Code uses structured tool APIs.
深度比較 OpenAI Codex CLI 和 Anthropic Claude Code 架構。兩者都使用單代理 ReAct 循環(思考→工具調用→觀察→重複)。主要差異:Codex 使用以 shell 為首的工具(cat、grep、find、apply_patch),安全性編碼在系統提示中;Claude Code 使用結構化工具 API。
Key Points
- Both use single-agent ReAct loops: single-threaded, sequential conversation history, no multi-agent concurrency
- Codex “mini-API” in system prompt: explicitly teaches the model how to invoke tools with examples (apply_patch format etc.)
- Codex design principles: shell-first tools, surgical diffs via apply_patch, sandbox/no-network defaults, approval gates for risky commands
- Codex: implicit iteration (read → edit → test) rather than upfront planning
- Shared insight: single-loop simplicity is deliberately chosen for debuggability and predictability
- Tool choice often comes down to “vibes” and system prompt style — both tools are roughly equivalent in capability
Insights
The “mini-API in system prompt” pattern (explicitly teaching the model how to call tools via examples in the prompt, not just tool definitions) is interesting — it suggests that even with native function calling, reinforcing the tool usage pattern with in-prompt examples improves consistency. The observation that “the real race is between Anthropic and OpenAI” and that tool choice often comes down to “vibes” reflects a real phenomenon: at this level of capability, differentiation is increasingly about UX and workflow fit rather than raw model quality.
Connections
Raw Excerpt
Codex encodes its operating model explicitly in the prompt: a single-agent ReAct loop with a tight tool contract and a “keep working until done” bias. The prompt teaches a shell-first toolkit and reserves file mutation for a strict apply_patch envelope, pushing the model toward minimal, surgical diffs rather than whole-file rewrites.