Learning Dexterous Manipulation Skills from Imperfect Simulations

This note was generated by AI analysis.

Created: 2026-03-28 Source: https://dexscrew.github.io

Summary

Hsieh et al. (UC Berkeley, arXiv:2512.02011) present DexScrew, a sim-to-real framework for contact-rich dexterous manipulation (screwdriving, nut-bolt fastening) that explicitly handles imperfect simulation. Key insight: RL in simplified simulation learns transferable rotational motion primitives; these primitives bootstrap real-world teleoperation to collect tactile demonstrations, enabling final behavior cloning policies that generalize to unseen object geometries.

Prerequisites

Reinforcement learning for robotics (policy gradients, sim-to-real)
Dexterous hand manipulation; multi-fingered grasping
Tactile sensing and proprioception
Behavior cloning / imitation learning

Core Idea

Two classic approaches fail for contact-rich dexterous manipulation: (1) sim-to-real RL requires accurate physics simulation of complex contact dynamics and tactile sensing, which is intractable; (2) teleoperation-based imitation learning requires high-quality dexterous demonstrations, which are hard to collect at scale because of the human-robot morphology gap.

DexScrew’s three-stage hybrid:

RL in simplified simulation: train on simplified object models that capture the essential rotational structure but ignore fine contact details → learn correct finger gaits (motion primitives)
Skill-assisted teleoperation: use the sim-trained policy as a skill primitive to guide human teleoperation in the real world → enables efficient collection of contact-rich demonstrations with tactile + proprioceptive data
Behavior cloning with tactile sensing: train BC policy on real-world demonstrations → generalizes to diverse object geometries and is robust to perturbations
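
The second stage above can be sketched in a few lines of Python. This is a toy illustration only, not the authors' implementation: the note does not describe the teleoperation interface, so the sketch assumes the operator merely gates the skill on and off. All names (`Step`, `toy_rotation_skill`, `collect_demo`, `read_tactile`) are hypothetical, the "skill" is a stand-in for the sim-trained policy, and the robot is assumed to track targets perfectly.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Step:
    proprio: List[float]  # joint positions before acting (illustrative)
    tactile: List[float]  # tactile readings before acting (illustrative)
    action: List[float]   # joint targets that were executed


def toy_rotation_skill(proprio: Sequence[float]) -> List[float]:
    """Stand-in for the sim-trained policy: advance every joint by a fixed step."""
    return [q + 0.1 for q in proprio]


def collect_demo(
    skill: Callable[[Sequence[float]], List[float]],
    operator_gates: Sequence[int],
    read_tactile: Callable[[Sequence[float]], List[float]],
    q0: Sequence[float],
) -> List[Step]:
    """Log one demonstration; the operator only decides run (1) or hold (0)."""
    q = list(q0)
    demo: List[Step] = []
    for gate in operator_gates:
        target = skill(q) if gate else list(q)  # hold keeps the current pose
        demo.append(Step(proprio=list(q), tactile=read_tactile(q), action=target))
        q = target  # assumption: perfect tracking in this toy
    return demo


# Usage: run, run, hold, run on a two-joint toy hand with a fake tactile sensor.
demo = collect_demo(
    toy_rotation_skill,
    operator_gates=[1, 1, 0, 1],
    read_tactile=lambda q: [abs(x) for x in q],
    q0=[0.0, 0.0],
)
```

Each logged `Step` pairs real sensor readings with the executed action, which is the kind of (observation, action) data the third stage consumes.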

Results

Higher task progress ratios than direct sim-to-real transfer on screwdriving and nut-bolt fastening
Generalization to nuts/screwdrivers with diverse geometries not seen in training
Robust performance under external perturbations
Code and videos at dexscrew.github.io

Limitations

Author-stated:

Framework assumes the core motion primitive (rotational skill) can be learned in simplified simulation; may not generalize to tasks where the full contact dynamics are critical even for primitive learning

Unstated:

Teleoperation quality still bounds the quality of real-world demonstrations
Three-stage pipeline adds complexity vs. end-to-end approaches
Evaluation scope limited to screwdriving/fastening; generality to other dexterous tasks unstated

Reproducibility

Code/Data: Project page at dexscrew.github.io (code available per project page)
Hardware: Multi-fingered robot hand with tactile sensing
Compute: Standard RL + behavior cloning training scale

Insights

The key architectural insight is that motion primitives (the “how” of rotation) are learnable from simplified simulation, even if the full contact dynamics are not. This separates the skill acquisition problem (learned in sim) from the sensing/dynamics problem (learned from real demonstrations). It’s an elegant decomposition that sidesteps the sim-to-real gap without ignoring simulation entirely. The skill-primitive-as-teleoperation-aid is also clever: it enables the human to focus on high-level decisions while the robot handles low-level rotational control.
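
The "learned from real demonstrations" half of this decomposition is, at its core, supervised regression from observations to actions. The sketch below fits a linear policy by gradient descent on mean squared error. It is a deliberately minimal stand-in under stated assumptions: the note does not specify the policy architecture, the data here are synthetic, and `fit_linear_bc` and `policy` are hypothetical names.

```python
from typing import List, Sequence, Tuple


def fit_linear_bc(
    observations: Sequence[Sequence[float]],
    actions: Sequence[float],
    lr: float = 0.1,
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Fit action ~ w . obs + b by full-batch gradient descent on MSE."""
    n, d = len(observations), len(observations[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for obs, act in zip(observations, actions):
            err = sum(wi * oi for wi, oi in zip(w, obs)) + b - act
            for i in range(d):
                grad_w[i] += 2.0 * err * obs[i] / n
            grad_b += 2.0 * err / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def policy(w: Sequence[float], b: float, obs: Sequence[float]) -> float:
    """Evaluate the cloned policy on one observation."""
    return sum(wi * oi for wi, oi in zip(w, obs)) + b


# Synthetic demonstrations: obs = [proprio, tactile], and the (made-up) expert
# slows the commanded rotation as tactile pressure rises.
obs_data = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
act_data = [0.5 * p - 1.0 * t + 0.2 for p, t in obs_data]

w, b = fit_linear_bc(obs_data, act_data)
held_out_action = policy(w, b, [0.2, 0.8])
```

Because the tactile reading is an input feature, the cloned policy can react to contact information that the simplified simulation never modeled, which is the point of deferring sensing to the real-world stage.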

Connections

Raw Excerpt

The key idea is that the motion primitives underlying contact-rich dexterous manipulation do not need to be learned from a perfect physics model. A simplified simulator is sufficient to induce the core rotational behaviors required for these tasks.
