Teaching robots dexterous manipulation skills often requires collecting hundreds of demonstrations using wearables or teleoperation, which is challenging to scale. Videos of human-object interactions are easier to collect and scale, but leveraging them directly for robot learning is difficult due to the lack of explicit action labels and morphological differences between robot and human hands.
We propose Human2Sim2Robot, a novel real-to-sim-to-real framework for training dexterous manipulation policies from a single RGB-D video of a human demonstrating a task. Our method uses reinforcement learning (RL) in simulation to cross the human-robot embodiment gap without relying on wearables, teleoperation, or the large-scale data collection typically required by imitation learning methods. From the demonstration, we extract two task-specific components: (1) the object pose trajectory, which defines an object-centric, embodiment-agnostic reward function, and (2) the pre-manipulation hand pose, which initializes and guides exploration during RL training. We find that these two components are highly effective for learning the desired task, eliminating the need for task-specific reward shaping and tuning. We demonstrate that Human2Sim2Robot significantly outperforms trajectory retargeting and one-shot imitation learning across a wide range of tasks, including grasping, non-prehensile manipulation, and extrinsic manipulation.
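To make the two extracted components concrete, the following is a minimal sketch, not the authors' implementation. It assumes the reward is an exponential of the weighted position and orientation error between the simulated object pose and the demonstrated object pose at the same timestep, and that exploration is seeded by adding small Gaussian noise to the pre-manipulation wrist position. The function names, the scale factors `pos_scale` and `rot_scale`, and the noise level `noise_std` are all hypothetical choices made for illustration.

```python
import math
import random


def quat_angle(q_a, q_b):
    """Geodesic angle in radians between two unit quaternions (w, x, y, z)."""
    # abs() treats q and -q as the same rotation.
    dot = abs(sum(a * b for a, b in zip(q_a, q_b)))
    return 2.0 * math.acos(min(1.0, dot))


def tracking_reward(obj_pos, obj_quat, ref_pos, ref_quat,
                    pos_scale=10.0, rot_scale=1.0):
    """Object-centric reward: depends only on the object pose, not the hand.

    ASSUMPTION: exponential shaping and the scale values are illustrative;
    the source does not specify the functional form of the reward.
    """
    pos_err = math.dist(obj_pos, ref_pos)      # metres
    rot_err = quat_angle(obj_quat, ref_quat)   # radians
    return math.exp(-(pos_scale * pos_err + rot_scale * rot_err))


def sample_initial_wrist_position(pre_manip_pos, rng, noise_std=0.02):
    """Perturb the demonstrated pre-manipulation wrist position.

    ASSUMPTION: isotropic Gaussian noise is a hypothetical way to start
    each RL episode near, but not exactly at, the demonstrated pose.
    """
    return tuple(p + rng.gauss(0.0, noise_std) for p in pre_manip_pos)


# Usage: reward is 1.0 when the object is exactly on the reference pose
# and decays as it drifts away from it.
identity = (1.0, 0.0, 0.0, 0.0)
ref_pos = (0.30, 0.00, 0.10)
r_on = tracking_reward(ref_pos, identity, ref_pos, identity)
r_off = tracking_reward((0.35, 0.00, 0.10), identity, ref_pos, identity)

rng = random.Random(0)
start = sample_initial_wrist_position((0.25, 0.05, 0.20), rng)
```

Because the reward never references the hand, the same function applies to any robot embodiment; only the initialization step uses information about the human hand.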
Snackbox Push (✅: 60%)
Snackbox Pivot (✅: 100%)
Snackbox Push Pivot (✅: 100%)
Plate Push (✅: 100%)
Plate Lift Rack (✅: 100%)
Plate Pivot Lift Rack (✅: 100%)
Pitcher Pour (✅: 86.6%)
Snackbox Push (✅: 10%)
Snackbox Pivot (✅: 100%)
Snackbox Push Pivot (✅: 40%)
Plate Push (✅: 0%)
Plate Lift Rack (✅: 0%)
Plate Pivot Lift Rack (✅: 0%)
Pitcher Pour (✅: 30%)
Snackbox Push (✅: 10%)
Snackbox Pivot (✅: 100%)
Snackbox Push Pivot (✅: 70%)
Plate Push (✅: 0%)
Plate Lift Rack (✅: 0%)
Plate Pivot Lift Rack (✅: 33.3%)
Pitcher Pour (✅: 50%)
Snackbox Push (✅: 30%)
Snackbox Pivot (✅: 80%)
Snackbox Push Pivot (✅: 20%)
Plate Push (✅: 0%)
Plate Lift Rack (✅: 0%)
Plate Pivot Lift Rack (✅: 0%)
Pitcher Pour (✅: 40%)
This work is supported by Stanford Human-Centered Artificial Intelligence (HAI), the National Science Foundation (NSF) under Grant Numbers 2153854 and 2342246, and the Natural Sciences and Engineering Research Council of Canada (NSERC) under Award Number 526541680.