Summary

This is a Chinese translation and commentary of Simon Willison’s “2025: The Year in LLMs,” covering the major paradigm shifts in AI over 2025: the rise of RLVR reasoning models, the breakout of coding agents (especially Claude Code), the dominance of Chinese open-source models, and new safety concerns around AI browsers and prompt injection.

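This is a Chinese translation of Simon Willison’s “2025: The Year in LLMs,” covering the major paradigm shifts in AI during 2025: the rise of RLVR reasoning models, the explosion of coding agents (especially Claude Code), Chinese open-source models dominating the leaderboards, and the security risks of AI browsers and prompt injection.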

Key Points

  • RLVR (Reinforcement Learning from Verifiable Rewards) became the dominant new training stage, enabling reasoning models
  • Claude Code launched quietly in Feb 2025 but became the most impactful AI product of the year, reaching $1B in annualized revenue by December
  • Chinese open-source models (DeepSeek, Qwen, Kimi, GLM, MiniMax) swept the top 5 spots on the Artificial Analysis leaderboard by year end
  • METR found that AI can now handle tasks that take humans hours to complete; the achievable task length doubles roughly every 7 months
  • “Lethal trifecta” coined: access to private data + exposure to untrusted content + ability to communicate externally = critical prompt injection risk
  • The MCP standard emerged and may already be fading; Anthropic’s Skills format was proposed as a simpler alternative
  • “Slop” was named Merriam-Webster’s word of the year
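
The doubling-time point above can be made concrete with a little arithmetic. The sketch below assumes the doubling time stays constant at 7 months; the function name `projected_horizon` and the 1-hour starting value are illustrative assumptions, not figures reported by METR.

```python
def projected_horizon(current_hours: float,
                      months_ahead: float,
                      doubling_months: float = 7.0) -> float:
    """Project the task length an AI can handle, assuming a constant doubling time.

    current_hours   -- hypothetical task length (in human hours) handled today
    months_ahead    -- how far into the future to project
    doubling_months -- assumed doubling time (about 7 months per the note above)
    """
    return current_hours * 2 ** (months_ahead / doubling_months)


if __name__ == "__main__":
    # Hypothetical starting point of 1 human-hour, projected forward in 7-month steps.
    for months in (0, 7, 14, 21):
        hours = projected_horizon(1.0, months)
        print(f"after {months:2d} months: {hours:.1f} hours")
```

Under these assumptions a 1-hour task horizon becomes 2 hours after 7 months and 4 hours after 14 months. The projection is only as good as the constant-doubling assumption.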

Insights

The translation shows how quickly the AI landscape shifted in a single year, from theoretical agents to production-grade coding assistants. Chinese open-source models rose alongside Western proprietary models, so that by the end of 2025 open-weight models were genuinely competitive with frontier closed models. This democratization has implications for both capability and governance.

Connections

Raw Excerpt

The most influential event of 2025 was Anthropic’s quiet release of Claude Code in February. It did not even get its own blog post; it was tucked into the Claude 3.7 Sonnet announcement.