State-Only Imitation Learning for Dexterous Manipulation

@inproceedings{Radosavovic2021StateOnlyIL,
  title={State-Only Imitation Learning for Dexterous Manipulation},
  author={Ilija Radosavovic and Xiaolong Wang and Lerrel Pinto and Jitendra Malik},
  booktitle={2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2021},
  pages={7865--7871}
}
Modern model-free reinforcement learning methods have recently demonstrated impressive results on a number of problems. However, complex domains like dexterous manipulation remain a challenge due to the high sample complexity. To address this, current approaches employ expert demonstrations in the form of state-action pairs, which are difficult to obtain for real-world settings such as learning from videos. In this paper, we move toward a more realistic setting and explore state-only imitation… 
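The abstract alludes to imitating from states alone, without expert actions. One common recipe in this setting is to train an inverse dynamics model on the agent's own exploration data and use it to recover pseudo-actions from the expert's state sequence. The sketch below illustrates that idea on toy 1-D dynamics; the setup and all names are illustrative assumptions, not the paper's actual implementation:

```python
# Illustrative sketch of state-only imitation via an inverse dynamics model:
# given expert transitions (s_t, s_{t+1}) but no actions, a learned inverse
# model f(s_t, s_{t+1}) -> a_t recovers pseudo-actions for behavior cloning.
import numpy as np

rng = np.random.default_rng(0)

# Toy deterministic dynamics: s' = s + a (1-D), so the true inverse is a = s' - s.
def step(s, a):
    return s + a

# Collect exploration data (states plus the actions the agent itself took).
S, A, S_next = [], [], []
s = 0.0
for _ in range(500):
    a = rng.uniform(-1.0, 1.0)
    s_next = step(s, a)
    S.append(s); A.append(a); S_next.append(s_next)
    s = s_next

# Fit a linear inverse dynamics model a ~ w1*s + w2*s' + b via least squares.
X = np.stack([S, S_next, np.ones(len(S))], axis=1)
w, *_ = np.linalg.lstsq(X, np.array(A), rcond=None)

# The expert provides only a state sequence; infer its actions with the model.
expert_states = np.linspace(0.0, 2.0, 11)   # states only, no actions
pairs = np.stack([expert_states[:-1], expert_states[1:], np.ones(10)], axis=1)
pseudo_actions = pairs @ w                  # each step of the expert path
```

With these toy dynamics the recovered pseudo-actions equal the state differences (0.2 per step), and they could then be fed to a standard behavior-cloning or demo-augmented RL objective.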

Citations

NEARL: Non-Explicit Action Reinforcement Learning for Robotic Control
TLDR
A novel hierarchical reinforcement learning framework without explicit actions: a high-level policy proposes the next desired state, and the actual action is produced by an inverse dynamics model, which stabilizes the training process.
Robust Learning from Observation with Model Misspecification
TLDR
A robust IL algorithm that learns policies which transfer effectively to the real environment without fine-tuning; on continuous-control benchmarks it outperforms the state-of-the-art state-only IL method in zero-shot transfer performance in the real environment and in robustness under different testing conditions.
Dexterous Imitation Made Easy: A Learning-Based Framework for Efficient Dexterous Manipulation
TLDR
'Dexterous Imitation Made Easy' (DIME) is a new imitation learning framework for dexterous manipulation that requires only a single RGB camera to observe a human operator and teleoperate a robotic hand.
Learning Feasibility to Imitate Demonstrators with Different Dynamics
TLDR
A feasibility MDP (f-MDP) is developed, and a feasibility score is derived by learning an optimal policy in the f-MDP, encouraging the imitator to learn from the more informative demonstrations and disregard those that are far from feasible.
DexMV: Imitation Learning for Dexterous Manipulation from Human Videos
TLDR
A new platform and pipeline for imitation learning to bridge the gap between computer vision and robot learning, DexMV (Dexterous Manipulation from Videos), is proposed and it is shown that the demonstrations can indeed improve robot learning by a large margin and solve the complex tasks which reinforcement learning alone cannot solve.
Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning
TLDR
It is shown that a single generalist policy can perform in-hand manipulation of over 100 geometrically diverse real-world objects and generalize to new objects with unseen shape or size. Multi-task learning with object point-cloud representations not only generalizes better but even outperforms single-object specialist policies on both training and held-out test objects.
From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation
TLDR
A novel single-camera teleoperation system to collect 3D demonstrations efficiently with only an iPad and a computer, showing large improvements over baselines on multiple complex manipulation tasks.
DexVIP: Learning Dexterous Grasping with Human Hand Pose Priors from Video
TLDR
Through experiments on 27 objects with a 30-DoF simulated robot hand, it is demonstrated that DexVIP compares favorably to existing approaches that lack a hand pose prior or rely on specialized tele-operation equipment to obtain human demonstrations, while also being faster to train.
Dexterous Robotic Grasping with Object-Centric Visual Affordances
TLDR
The key idea is to embed an object-centric visual affordance model within a deep reinforcement learning loop to learn grasping policies that favor the same object regions favored by people.
Dexterous Manipulation for Multi-Fingered Robotic Hands With Reinforcement Learning: A Review
TLDR
A comprehensive review of techniques for dexterous manipulation with multi-fingered robotic hands, from the early model-based approaches without learning to the latest research and methodologies based on reinforcement learning and its variations.

References

Showing 1-10 of 64 references
RIDM: Reinforced Inverse Dynamics Modeling for Learning from a Single Observed Demonstration
TLDR
Introduces and evaluates reinforced inverse dynamics modeling (RIDM), a novel paradigm for combining imitation from observation (IfO) and reinforcement learning with no dependence on demonstrator action information.
Reinforcement and Imitation Learning for Diverse Visuomotor Skills
TLDR
This work proposes a model-free deep reinforcement learning method that leverages a small amount of demonstration data to assist a reinforcement learning agent and trains end-to-end visuomotor policies that map directly from RGB camera inputs to joint velocities.
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
TLDR
This work shows that model-free DRL with natural policy gradients can effectively scale up to complex manipulation tasks with a high-dimensional 24-DoF hand, and solve them from scratch in simulated experiments.
Deep Dynamics Models for Learning Dexterous Manipulation
TLDR
It is shown that improvements in learned dynamics models, together with improvements in online model-predictive control, can enable efficient and effective learning of flexible, contact-rich dexterous manipulation skills on a 24-DoF anthropomorphic hand in the real world, using just 4 hours of purely real-world data to learn to simultaneously coordinate multiple free-floating objects.
State Alignment-based Imitation Learning
TLDR
This work proposes a novel state alignment-based imitation learning method that trains the imitator to follow the state sequences in the expert demonstrations as closely as possible, combined into a reinforcement learning framework through a regularized policy-update objective.
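Several of the state-only methods in this list share a common primitive: rewarding the imitator for staying close to the expert's state sequence. A minimal illustrative sketch of such a state-matching reward (the distance measure and per-timestep alignment are assumptions for illustration, not any one paper's exact objective):

```python
# Illustrative state-matching reward for state-only imitation: the imitator
# is scored by its distance to the time-aligned expert state, so trajectories
# that track the expert accumulate higher (less negative) reward.
import numpy as np

def state_matching_reward(agent_state, expert_state):
    """Negative Euclidean distance to the time-aligned expert state."""
    diff = np.atleast_1d(agent_state) - np.atleast_1d(expert_state)
    return -float(np.linalg.norm(diff))

# A trajectory that tracks the expert closely scores higher than one that drifts.
expert = np.linspace(0.0, 1.0, 5)
close  = expert + 0.01
far    = expert + 0.5
r_close = sum(state_matching_reward(a, e) for a, e in zip(close, expert))
r_far   = sum(state_matching_reward(a, e) for a, e in zip(far, expert))
```

In practice such a reward is plugged into a standard RL loop in place of (or alongside) a task reward, which is what lets these methods dispense with expert actions entirely.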
Hybrid Reinforcement Learning with Expert State Sequences
TLDR
This paper proposes a novel tensor-based model to infer the unobserved actions underlying an expert's state sequences, together with a hybrid objective combining reinforcement learning and imitation learning.
Provably Efficient Imitation Learning from Observation Alone
TLDR
FAIL is the first provably efficient algorithm in the ILFO setting, learning a near-optimal policy with a number of samples that is polynomial in all relevant parameters but independent of the number of unique observations.
Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation
TLDR
This work proposes an imitation learning method based on video prediction with context translation and deep reinforcement learning that enables a variety of interesting applications, including learning robotic skills that involve tool use simply by observing videos of human tool use.
Learning dexterous in-hand manipulation
TLDR
This work uses reinforcement learning (RL) to learn dexterous in-hand manipulation policies that can perform vision-based object reorientation on a physical Shadow Dexterous Hand, and these policies transfer to the physical robot despite being trained entirely in simulation.
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards
TLDR
A general and model-free approach for reinforcement learning on real robots with sparse rewards, built upon the Deep Deterministic Policy Gradient algorithm to use demonstrations; it outperforms DDPG and does not require engineered rewards.