Top AI Papers of the Week (March 23–29, 2026)

This article was generated by AI analysis.

Created: 2026-03-30 Source: https://x.com/dair_ai/status/2038262704486400218

Summary

DAIR.AI’s weekly digest of top AI papers for March 23–29, 2026. The ten papers span self-improving agents (Hyperagents), AGI framing (Agentic AI and the Next Intelligence Explosion), a new agentic benchmark (ARC-AGI-3), an autonomous red-teaming system (Claudini), a Transformer architecture improvement (Attention Residuals), cross-agent memory sharing (MemCollab), a specialized coding model (Composer 2), turn-level RL for long-horizon tasks (PivotRL), a survey on agent workflow optimization, and a brain-inspired multi-agent reasoning system (BIGMAS).

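This week's most notable development is the convergence of three directions: self-improving agents (Hyperagents), institutional AI alignment (the Agentic AI report), and a new evaluation framework for agent capabilities (ARC-AGI-3). Meanwhile, Claudini shows how autoresearch-style automated research can produce real results in the security domain.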

Key Points

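Hyperagents: makes the meta-level modification procedure itself editable, enabling metacognitive self-modification (changing not only task performance but also the mechanism by which improvement happens). DGM-Hyperagents removes the assumption that task ability and self-modification ability must be aligned, opening self-improvement to any computable task.
Agentic AI & Intelligence Explosion: Google researchers argue that the next intelligence explosion will be social rather than individual. Frontier reasoning models internally simulate a “society of thought” instead of thinking linearly; AI alignment must move from dyadic approaches (RLHF) to institutional design (digital protocols modeled on organizations and markets).
ARC-AGI-3: a huge gap between humans at 100% and top AI systems below 1%. Its interactive, turn-based design requires agents to explore, infer goals, and build models of dynamic environments; as of March 2026 it is the only unsaturated benchmark of general agentic intelligence.
Claudini: an autoresearch pipeline driven by Claude Code autonomously discovers new adversarial attack algorithms, reaching a 40% attack success rate on CBRN queries (vs. under 10% for all existing methods). White-box red-teaming is especially well suited to automated research because the optimization objective provides dense, quantitative feedback.
Attention Residuals (AttnRes): replaces the fixed unit-weight residual connections in the Transformer with softmax attention, letting each layer aggregate earlier layers' outputs in a content-dependent way and addressing the PreNorm dilution problem. Validated on a Kimi 48B MoE model.
Composer 2 (Cursor): two-stage training of continued pretraining followed by large-scale RL, with training done in the same harness used for deployment (train-in-harness). Scores 61.7 on Terminal-Bench and 73.7 on SWE-bench Multilingual.
PivotRL (NVIDIA): identifies “pivots” (high-variance decision points) in SFT trajectories and concentrates the training signal on those critical turns, matching the accuracy of end-to-end RL with 4× fewer rollouts; already used in production training of Nemotron-3-Super-120B.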
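
To make the Attention Residuals idea above more concrete, here is a minimal sketch contrasting a unit-weight residual sum with a softmax-weighted mix over earlier layer outputs. It is an illustration under stated assumptions, not the paper's implementation: the function names, the single `query` vector per layer, and the scaled dot-product scoring are all hypothetical choices made for this note.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def standard_residual(layer_outputs):
    """Unit-weight residual stream: the plain sum of all earlier outputs."""
    dim = len(layer_outputs[0])
    return [sum(out[i] for out in layer_outputs) for i in range(dim)]


def attention_residual(layer_outputs, query):
    """Softmax-weighted mix of earlier outputs (assumed scoring: scaled dot product).

    `query` stands in for whatever the current layer uses to decide which
    earlier outputs matter; how it is produced is an assumption of this sketch.
    """
    dim = len(query)
    scores = [
        sum(q * x for q, x in zip(query, out)) / math.sqrt(dim)
        for out in layer_outputs
    ]
    weights = softmax(scores)
    mixed = [
        sum(w * out[i] for w, out in zip(weights, layer_outputs))
        for i in range(dim)
    ]
    return mixed, weights


# Three earlier "layer outputs" and a query that favors the first coordinate.
outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed, weights = attention_residual(outputs, query=[2.0, 0.0])
```

With this query, the first and third outputs receive equal weights that are larger than the second's, whereas `standard_residual` always weights every earlier output by 1. Because the softmax weights sum to 1, the mixed input does not grow with depth, which is one plausible reading of how content-dependent aggregation could counter the dilution described above.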

Insights

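ARC-AGI-3's design philosophy exposes a core weakness of current AI: on static pattern recognition AI is already close to humans (93% on ARC-AGI-1, 68.8% on ARC-AGI-2), but on tasks that require active exploration, building an internal model of the environment, and adapting, it scores almost zero (< 1%). This suggests that current LLMs are essentially memorization machines rather than adaptive reasoners.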

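Claudini is a concrete case of “autoresearch for security research”, and it also supports an important principle: when the evaluation metric is dense and quantifiable (attack success rate), autoresearch-style agents can produce better results than humans. Which other domains this pattern can be replicated in is an important design question.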

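Composer 2's “train-in-harness” strategy, training in exactly the same environment used for deployment, addresses train/deployment distribution shift, which is one of the root causes of failure in many agent systems.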

Connections

Clippings-人人都是深度学习工程师：autoresearch如何改变训练方法论: an in-depth analysis of the autoresearch methodology; Claudini is its direct application in the security domain
multi-agent
reinforcement-learning
benchmarks

Raw Excerpt

“Humans can solve 100% of the environments while frontier AI systems score below 1%. For comparison, systems reach 93% on ARC-AGI-1 and 68.8% on ARC-AGI-2, but performance collapses on ARC-AGI-3.”
