MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations

This article was generated by AI analysis.

Created: 2026-04-05 Source: https://arxiv.org/abs/2310.17596

Summary

MimicGen is a data generation system that amplifies a small number of human demonstrations (~10–200) into over 50,000 diverse synthetic demonstrations across 18 manipulation tasks, addressing the data-scale bottleneck in robot learning. It works by decomposing demonstrations into object-centric segments and rigidly transforming them to new scene configurations, then validating the generated trajectories via physics simulation. Policies trained on MimicGen data achieve 59–96% success depending on task complexity.

Prerequisites

Imitation learning / Behavior Cloning: MimicGen generates data to train BC policies, so understanding IL is needed to interpret the results
Object-centric representations: the decomposition step assumes rigid-body, object-centric segments; scenes with deformable or freely interacting objects break that assumption
Robot simulation (MuJoCo/Robosuite): generation and validation happen entirely in simulation, so understanding sim physics helps interpret success and failure rates
HDF5 dataset format: output is stored in HDF5 files compatible with Robomimic; knowing the format helps with the downstream training pipeline

Core Idea

The key insight is that robot manipulation demonstrations have an object-centric structure: each subtask is a motion relative to a specific object, and that relative motion can be transformed to new scene configurations via rigid-body SE(3) transforms. MimicGen exploits this by (1) segmenting human demos at subtask boundaries, (2) retaining each segment’s object-relative motion, and (3) recomposing transformed segments with motion planning to bridge transitions. Physics simulation acts as a free quality filter — infeasible trajectories simply fail and are discarded.
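The object-relative transform in step (2) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the MimicGen codebase: the function names (`make_pose`, `invert_pose`, `transform_segment`) are hypothetical, poses are 4x4 homogeneous matrices in the world frame, and the object is assumed to be static during the segment. Each end-effector pose is first expressed relative to the source object, then re-attached to the object's pose in the new scene.

```python
import numpy as np


def make_pose(yaw_rad, translation):
    """Build a 4x4 homogeneous pose from a yaw angle (about z) and a translation."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    pose = np.eye(4)
    pose[:3, :3] = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    pose[:3, 3] = translation
    return pose


def invert_pose(pose):
    """Invert a rigid transform: rotation R^T, translation -R^T t."""
    rot, trans = pose[:3, :3], pose[:3, 3]
    inv = np.eye(4)
    inv[:3, :3] = rot.T
    inv[:3, 3] = -rot.T @ trans
    return inv


def transform_segment(eef_poses, src_obj_pose, new_obj_pose):
    """Re-target a segment of end-effector poses, shape (N, 4, 4), to a new object pose."""
    # Express every end-effector pose in the source object's frame.
    relative = invert_pose(src_obj_pose) @ eef_poses
    # Re-attach the same relative motion to the object in the new scene.
    return new_obj_pose @ relative


# Hypothetical example: a straight-down approach above an object,
# replayed after the object has been moved and rotated by 90 degrees.
src_obj = make_pose(0.0, [0.5, 0.0, 0.8])
new_obj = make_pose(np.pi / 2, [0.3, 0.2, 0.8])
segment = np.stack(
    [make_pose(0.0, [0.5, 0.0, 0.8 + h]) for h in (0.20, 0.10, 0.02)]
)
new_segment = transform_segment(segment, src_obj, new_obj)
```

The transformed segment still approaches from directly above the object, because the motion relative to the object is unchanged. Bridging from the robot's current pose to the start of `new_segment`, and checking that the result actually succeeds in simulation, are separate steps not shown here.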

Results

Metric	Result
Demos generated from <200 human demos	50,000+
Data multiplier (10 demos → ~1,000)	~100×
BC success range across 18 tasks	59–96%
Square task: generated vs human demo	79% vs 84% (comparable)
DexMimicGen (2024): humanoid bimanual	Scales to 22-DoF systems

Limitations

Author-stated: assumes rigid-body objects; deformable objects (cloth, fluids) not supported
Author-stated: generated data quality degrades for tasks requiring many sequential decisions
Unstated: the 50–70% generation success rate means ~30–50% of compute is wasted on failed trajectories
Unstated: the sim-to-real gap means generated demonstrations need real-world fine-tuning before deployment
Unstated: “mixed-quality” generated trajectories — RoboCasa365 found MimicGen data quality lower than human demos, though scale compensates
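The compute-waste point above is simple arithmetic, sketched below under two assumptions that are not from the paper: every generation attempt costs the same whether it succeeds or fails, and attempts succeed independently at a fixed rate. The helper names are hypothetical.

```python
import math


def attempts_needed(target_successes, success_rate):
    """Expected number of attempts to collect the target number of successes."""
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return math.ceil(target_successes / success_rate)


def wasted_fraction(success_rate):
    """Share of attempts (and, under equal cost, compute) spent on failures."""
    return 1.0 - success_rate


low = attempts_needed(1000, 0.5)   # 2000 attempts at a 50% success rate
high = attempts_needed(1000, 0.7)  # 1429 attempts at a 70% success rate
```

In practice a failed rollout may terminate early and cost less than a successful one, so the wasted share of compute could be lower than the wasted share of attempts.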

Reproducibility

Code: open-source at github.com/NVlabs/mimicgen
Datasets: Robosuite simulation environments (MuJoCo); 18 task environments provided
Compute: demo generation ~30 min/1,000 demos on GPU; BC training ~30 min/1,000 epochs

Insights

MimicGen inverts the conventional data collection bottleneck: instead of asking “how do we collect more demonstrations?” it asks “how do we extract more information from the demonstrations we have?” The object-centric decomposition is the enabling insight — it means demonstrations are not rigid sequences but composable motion primitives that can be recombined.

The comparable performance of generated vs human data (79% vs 84% on Square) raises a deeper question: if generated data quality is close to human data quality at a fraction of the cost, what is the marginal value of human teleoperation for well-covered tasks? The answer likely depends on task complexity — for long-horizon, contact-rich tasks, human judgment still dominates.

DexMimicGen (2024) extending this to bimanual humanoid hands (22 DoF) is a significant step: it suggests the object-centric decomposition paradigm scales to more complex morphologies.

Connections

Raw Excerpt

“We generate over 50,000 demonstrations from less than 200 human demonstrations across 18 tasks, multiple simulators, and the real-world.”
