Reinforcement and Imitation Learning for Diverse Visuomotor Skills

@article{Zhu2018ReinforcementAI,
  title={Reinforcement and Imitation Learning for Diverse Visuomotor Skills},
  author={Yuke Zhu and Ziyun Wang and Josh Merel and Andrei A. Rusu and Tom Erez and Serkan Cabi and Saran Tunyasuvunakool and J{\'a}nos Kram{\'a}r and Raia Hadsell and Nando de Freitas and Nicolas Manfred Otto Heess},
  journal={arXiv preprint arXiv:1802.09564},
  year={2018}
}
We propose a model-free deep reinforcement learning method that leverages a small amount of demonstration data to assist a reinforcement learning agent. We apply this approach to robotic manipulation tasks and train end-to-end visuomotor policies that map directly from RGB camera inputs to joint velocities. We demonstrate that our approach can solve a wide variety of visuomotor tasks, for which engineering a scripted controller would be laborious. In experiments, our reinforcement and imitation… 
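The abstract describes an agent trained from both a task reward and an imitation signal derived from a small set of demonstrations. As a rough illustration of how such a hybrid reward can be blended (the weighting scheme, the GAIL-style discriminator term, and all names below are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def hybrid_reward(task_reward, discriminator_score, lam=0.5, eps=1e-8):
    """Blend a task reward with a GAIL-style imitation reward.

    A minimal sketch of the reinforcement-plus-imitation idea: the agent
    is rewarded both for task progress and for producing state-action
    pairs that a discriminator mistakes for demonstration data.
    """
    # -log(1 - D(s, a)) is large when the discriminator believes the
    # transition came from a demonstration (D close to 1).
    imitation_reward = -np.log(np.clip(1.0 - discriminator_score, eps, 1.0))
    return lam * task_reward + (1.0 - lam) * imitation_reward
```

With `lam=1.0` this degenerates to pure reinforcement learning on the task reward; with `lam=0.0` it is pure adversarial imitation, which is one way to read the paper's claim that the combination beats either signal alone.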

TLDR
A model-free deep reinforcement learning method is proposed that leverages a small amount of demonstration data to assist a reinforcement learning agent and achieves significantly better performance than agents trained with reinforcement learning or imitation learning alone.

Citations
Reinforced Imitation: Sample Efficient Deep Reinforcement Learning for Mapless Navigation by Leveraging Prior Demonstrations
TLDR
A case study of a learning-based approach for target-driven mapless navigation that outperforms both standalone approaches in the number of successful navigation tasks and can be significantly simplified when pretraining is used, e.g., with only a sparse reward.
CRIL: Continual Robot Imitation Learning via Generative and Prediction Model
TLDR
This work proposes a novel trajectory generation model that employs both a generative adversarial network and a dynamics-aware prediction model to generate pseudo trajectories from all learned tasks during new task learning, thus reducing the burden of multi-task IL and accelerating the learning of new tasks at the same time.
State-Only Imitation Learning for Dexterous Manipulation
TLDR
This paper trains an inverse dynamics model and uses it to predict actions for state-only demonstrations and considerably outperforms RL alone, and is able to learn from demonstrations with different dynamics, morphologies, and objects.
Task-Oriented Deep Reinforcement Learning for Robotic Skill Acquisition and Control
TLDR
An efficient model-free off-policy actor-critic algorithm for robotic skill acquisition and continuous control is presented, which fuses the task reward with a task-oriented guiding reward formulated by leveraging a few imperfect expert demonstrations.
Combining learned skills and reinforcement learning for robotic manipulations
TLDR
This work proposes RL policies operating on pre-trained skills that can learn composite manipulations using no intermediate rewards and no demonstrations of full tasks, and shows successful learning of policies for composite manipulation tasks such as making a simple breakfast.
Residual Reinforcement Learning from Demonstrations
TLDR
Examination on simulated manipulation tasks demonstrates that residual RL from demonstrations is able to generalize to unseen environment conditions more flexibly than either behavioral cloning or RL fine-tuning, and is capable of solving high-dimensional, sparse-reward tasks out of reach for RL from scratch.
Reward Relabelling for combined Reinforcement and Imitation Learning on sparse-reward tasks
TLDR
This work presents a new method, able to leverage demonstrations and episodes collected online in any sparse-reward environment with any off-policy algorithm, based on a reward bonus given to demonstrations and successful episodes, encouraging expert imitation and self-imitation.
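The reward-bonus idea above can be sketched in a few lines (the uniform bonus value and the choice to relabel every transition of a successful episode are simplifying assumptions for illustration, not the paper's exact scheme):

```python
import numpy as np

def relabel_rewards(rewards, is_demo, episode_success, bonus=1.0):
    """Add a reward bonus to demonstration transitions and, when the
    episode succeeded, to every transition in it, encouraging both
    expert imitation and self-imitation."""
    rewards = np.asarray(rewards, dtype=float)
    # Bonus for transitions that came from expert demonstrations.
    boosted = rewards + bonus * np.asarray(is_demo, dtype=float)
    # Additional bonus for all transitions of a successful episode.
    if episode_success:
        boosted = boosted + bonus
    return boosted
```

Because the relabelling happens on stored transitions, a sketch like this can sit in front of any off-policy algorithm's replay buffer, which matches the method's claimed generality.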
Vision-Based Deep Reinforcement Learning For UR5 Robot Motion Control
TLDR
The results indicate that the vision-based DRL method proposed in the paper can successfully learn the reaching-task skill, and that the asymmetric actor-critic structure and auxiliary-task objective can effectively improve the learning efficiency and final performance of the DRL method.
Learning to combine primitive skills: A step towards versatile robotic manipulation
TLDR
This work aims to overcome previous limitations and proposes a reinforcement learning (RL) approach to task planning that learns to combine primitive skills, together with efficient training of those basic skills from few synthetic demonstrations by exploring recent CNN architectures and data augmentation.

References

Showing 1-10 of 62 references
Learning human behaviors from motion capture by adversarial imitation
TLDR
Generative adversarial imitation learning is extended to enable training of generic neural network policies to produce humanlike movement patterns from limited demonstrations consisting only of partially observed state features, without access to actions, even when the demonstrations come from a body with different and unknown physical parameters.
One-Shot Visual Imitation Learning via Meta-Learning
TLDR
A meta-imitation learning method that enables a robot to learn how to learn more efficiently, allowing it to acquire new skills from just a single demonstration, and requires data from significantly fewer prior tasks for effective learning of new skills.
Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation
TLDR
This work proposes an imitation learning method based on video prediction with context translation and deep reinforcement learning that enables a variety of interesting applications, including learning robotic skills that involve tool use simply by observing videos of human tool use.
Continuous control with deep reinforcement learning
TLDR
This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
Deep Reinforcement Learning for Robotic Manipulation
TLDR
It is demonstrated that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots.
Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates
TLDR
It is demonstrated that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots.
End-to-End Training of Deep Visuomotor Policies
TLDR
This paper develops a method that can be used to learn policies that map raw image observations directly to torques at the robot's motors, trained using a partially observed guided policy search method, with supervision provided by a simple trajectory-centric reinforcement learning method.
Learning Dexterous Manipulation Policies from Experience and Imitation
TLDR
This work shows that local trajectory-based controllers for complex non-prehensile manipulation tasks can be constructed from surprisingly small amounts of training data, and collections of such controllers can be interpolated to form more global controllers.
Overcoming Exploration in Reinforcement Learning with Demonstrations
TLDR
This work uses demonstrations to overcome the exploration problem and successfully learn to perform long-horizon, multi-step robotics tasks with continuous control such as stacking blocks with a robot arm.
Asymmetric Actor Critic for Image-Based Robot Learning
TLDR
This work exploits the full state observability in the simulator to train better policies which take as input only partial observations (RGBD images) and combines this method with domain randomization and shows real robot experiments for several tasks like picking, pushing, and moving a block.