Robotics Changes Everyday but It Is Still the Same Three Things

This note was generated by AI analysis.

Created: 2025-01-01

Summary

A practitioner’s account of iterating through multiple robot learning approaches with an SO-100 arm. The author tried PPO (reinforcement learning, with sim2real transfer from IsaacLab), FPO (a flow policy, which struggled with exploding gradients), FPO++ (an improved version), ACT (Action Chunking with Transformers, trained on 50 demonstration episodes), and SmolVLA fine-tuning (a vision-language-action model). The article reflects on the principles all of these methods share despite the rapidly changing landscape.

Key Points

PPO: model-free RL, requires careful sim environment design for sim2real transfer; IsaacLab used for simulation
FPO (Flow Policy): continuous action distribution via normalizing flows; author encountered exploding gradient issues
FPO++: improved stability over FPO; resolved gradient issues
ACT: 50 demonstration episodes sufficient for simple manipulation tasks; chunked actions improve smoothness
SmolVLA: HuggingFace’s small VLA model; fine-tunable with modest compute
The “same three things” framing: perception, planning, and action remain the core challenges regardless of method
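To make the ACT point about chunked actions concrete, here is a minimal sketch of temporal ensembling, the blending scheme described in the ACT paper: the policy predicts a chunk of future actions at every step, and the action executed at step t is a weighted average of every prediction made for t, with weights exp(-m * i) where i = 0 is the oldest prediction. The note does not say whether the author enabled this option, and `dummy_policy`, the chunk size, and the value of `m` are hypothetical stand-ins chosen for illustration.

```python
import numpy as np


def temporal_ensemble(chunks, t, m=0.01):
    """Blend every prediction that was made for timestep t.

    chunks maps the step at which a chunk was predicted to an array of
    shape (chunk_size, action_dim). Weights are exp(-m * i), with i = 0
    for the oldest prediction, then normalized to sum to 1.
    """
    preds = []
    for start in sorted(chunks):            # oldest chunk first
        offset = t - start
        if 0 <= offset < len(chunks[start]):
            preds.append(chunks[start][offset])
    preds = np.stack(preds)
    weights = np.exp(-m * np.arange(len(preds)))
    weights /= weights.sum()
    return (weights[:, None] * preds).sum(axis=0)


def dummy_policy(t, chunk_size=4, action_dim=2):
    """Hypothetical stand-in for a trained model.

    Predicts a chunk whose k-th action is simply the value t + k.
    """
    steps = np.arange(t, t + chunk_size, dtype=float)
    return np.tile(steps[:, None], (1, action_dim))


if __name__ == "__main__":
    chunks = {}
    for t in range(6):
        chunks[t] = dummy_policy(t)         # predict a new chunk each step
        action = temporal_ensemble(chunks, t)
        print(t, action)
```

Because overlapping chunks are averaged rather than swapped abruptly, consecutive executed actions change gradually, which is one plausible reading of the smoothness claim above.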

Insights

50 episodes for ACT is surprisingly few; the implication is that demonstration quality matters more than quantity for simple tasks
The progression from RL → flow policies → VLAs reflects the field’s broader trajectory: from hand-crafted rewards to imitation to language-conditioned generalization
Exploding gradients in FPO on real robot data suggest that flow policies are more sensitive to distribution shift than RL methods
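The note does not say how FPO++ resolved the exploding gradients. As a generic illustration only, and not a description of FPO++, the sketch below shows clipping by global norm, a common first mitigation when gradient magnitudes blow up during training. The function name and the `max_norm` values are assumptions made for this example.

```python
import numpy as np


def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their joint L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping,
    which is worth logging to spot spikes.
    """
    total = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    scale = min(1.0, max_norm / (total + 1e-12))  # leave small gradients untouched
    return [g * scale for g in grads], total


if __name__ == "__main__":
    grads = [np.array([3.0, 0.0]), np.array([4.0])]
    clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
    print(norm_before, clipped)
```

Clipping preserves the direction of the update and only limits its size, so it treats the symptom rather than the cause; a persistently large pre-clip norm is still a signal that the loss or data needs attention.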

Connections

RH20T dataset in this vault: the data collection challenge ACT faces (needing demos) is exactly what RH20T addresses at scale
Sunday.ai Memo robot: also uses imitation learning via a specialized data capture glove
Open source robotics stack article: ACT is part of the LeRobot library, which Arne Baeyens recommends

Raw Excerpt

“After going through PPO, FPO, FPO++, ACT, and SmolVLA in a few months, I realized the tools change constantly but the problem doesn’t: you need the robot to perceive its environment, plan an action, and execute it reliably. Every new method is just a different answer to the same three questions.”
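The three questions in the excerpt map onto a simple control loop. The sketch below is a toy illustration under stated assumptions, not the author's code: `ToyArm`, `control_loop`, and the proportional planner are hypothetical, and a learned policy such as PPO, ACT, or SmolVLA would take the place of the `plan` callable.

```python
from typing import Callable, List, Tuple


class ToyArm:
    """Hypothetical 1-D 'robot': a single joint position."""

    def __init__(self) -> None:
        self.position = 0.0

    def perceive(self) -> float:
        # Perception: read the current state (a real robot would use cameras and encoders).
        return self.position

    def act(self, delta: float) -> None:
        # Action: apply the commanded change in position.
        self.position += delta


def control_loop(
    perceive: Callable[[], float],
    plan: Callable[[float], float],
    act: Callable[[float], None],
    steps: int,
) -> List[Tuple[float, float]]:
    """Run perceive -> plan -> act for a fixed number of steps and log each pair."""
    log = []
    for _ in range(steps):
        obs = perceive()
        action = plan(obs)
        act(action)
        log.append((obs, action))
    return log


if __name__ == "__main__":
    arm = ToyArm()
    target = 1.0
    # Planning: a proportional controller stands in for a learned policy.
    log = control_loop(arm.perceive, lambda obs: 0.5 * (target - obs), arm.act, steps=10)
    print(arm.position)
```

Swapping one method for another changes only what sits behind `plan` (and sometimes `perceive`); the loop itself stays the same, which is the point the excerpt makes.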
