Karpathy: 2025 LLM Year in Review

This note was generated by AI analysis.

Created: 2026-03-26 Source: https://karpathy.bearblog.dev/year-in-review-2025/

Summary

Andrej Karpathy’s 2025 LLM year-in-review identifies six paradigm shifts: RLVR (reinforcement learning from verifiable rewards) unlocking reasoning at scale; LLMs as “ghosts, not animals” with a jagged, alien intelligence; Cursor revealing a new, thick application layer on top of LLMs; Claude Code redefining what an AI that lives on your computer looks like; vibe coding enabling anyone to build software; and Google’s Nano Banana hinting at what an LLM GUI could become.

Key Points

RLVR (reinforcement learning from verifiable rewards) supports much longer optimization runs; 2025 capability gains came largely from longer RL runs rather than from larger models
“Ghosts not animals”: LLMs are an alien intelligence, simultaneously genius and confused; benchmarks, being verifiable environments, are now gameable via RLVR
Claude Code is notable for running on your computer with your context/secrets/config; Anthropic got the localhost-first order right vs. OpenAI’s cloud-first approach
Vibe coding: code is free, ephemeral, malleable — anyone can build software now
Nano Banana (Gemini): first hint of true LLM GUI — joint text+image+world knowledge in model weights
Cursor demonstrated the “LLM app” layer: context engineering + multi-LLM orchestration + app-specific GUI + autonomy slider
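
To make “verifiable” in RLVR concrete, here is a minimal sketch of a reward that a program can check without human judgment. It assumes a math task whose final answer is a single number. The function name `verifiable_reward` and the sample answers are hypothetical illustrations, not taken from Karpathy’s post or from any lab’s training pipeline.

```python
from fractions import Fraction


def verifiable_reward(model_answer: str, reference: str) -> float:
    """Hypothetical reward: 1.0 if the answer equals the reference exactly, else 0.0."""
    try:
        # Exact rational comparison, so "0.5" and "1/2" count as the same answer.
        matches = Fraction(model_answer.strip()) == Fraction(reference.strip())
    except (ValueError, ZeroDivisionError):
        # Output that cannot be parsed as a number earns no reward.
        return 0.0
    return 1.0 if matches else 0.0


# Illustrative sampled answers to a question whose reference answer is 1/2.
samples = ["0.5", "1/2", "2", "one half"]
rewards = [verifiable_reward(s, "1/2") for s in samples]
print(rewards)
```

Because the check is automatic, it can be run over very large numbers of sampled answers, which is what makes long RL runs practical in domains like math and code. It also only exists where answers can be checked, which fits the note’s point about performance spiking in verifiable domains.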

Insights

Karpathy’s “ghosts not animals” framing is the most conceptually important part: because LLMs are optimized on entirely different objectives (imitating text, solving puzzles, getting upvotes) than animals (jungle survival), they develop deeply alien capability profiles. This explains both the benchmark gaming problem and why “AGI” comparisons to humans are fundamentally misleading. The jagged capability profile is a direct consequence of RLVR spiking performance in verifiable domains.

Connections

Raw Excerpt

Everything about the LLM stack is different (neural architecture, training data, training algorithms, and especially optimization pressure) so it should be no surprise that we are getting very different entities in the intelligence space, which are inappropriate to think about through an animal lens.
