Meta Reinforcement Learning for Sim-to-real Domain Adaptation

@inproceedings{Arndt2020MetaRL,
  title={Meta Reinforcement Learning for Sim-to-real Domain Adaptation},
  author={Karol Arndt and Murtaza Hazara and Ali Ghadirzadeh and Ville Kyrki},
  booktitle={2020 IEEE International Conference on Robotics and Automation (ICRA)},
  year={2020},
  pages={2725--2731}
}
Modern reinforcement learning methods suffer from low sample efficiency and unsafe exploration, making it infeasible to train robotic policies entirely on real hardware. In this work, we propose to address the problem of sim-to-real domain transfer by using meta learning to train a policy that can adapt to a variety of dynamic conditions, and using a task-specific trajectory generation model to provide an action space that facilitates quick exploration. We evaluate the method by performing…
Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms
TLDR
The proposed method successfully adapts a trained policy to different robotic platforms with novel physical parameters, and the meta-learning algorithm is shown to outperform state-of-the-art methods on the introduced few-shot policy adaptation problem.
Sim-to-Real Transfer in Deep Reinforcement Learning for Robotics: a Survey
TLDR
This survey covers the fundamental background behind sim-to-real transfer in deep reinforcement learning and gives an overview of the main methods currently in use: domain randomization, domain adaptation, imitation learning, meta-learning, and knowledge distillation.
Real Robot Challenge using Deep Reinforcement Learning
This paper details our winning submission to Phase 1 of the 2021 Real Robot Challenge, a challenge in which a three-fingered robot must carry a cube along specified goal trajectories. To solve Phase…
A Domain Data Pattern Randomization based Deep Reinforcement Learning method for Sim-to-Real transfer
TLDR
A memory-enhanced domain data pattern randomization method is proposed, which achieves data enhancement by randomizing the distribution pattern of data connection. At the same time, a memory mechanism based on a recurrent neural network is introduced into the decision model to alleviate the jitter of the environmental distribution caused by data pattern changes.
Lifelong Robotic Reinforcement Learning by Retaining Experiences
TLDR
This work studies a sequential multi-task RL problem motivated by the practical constraints of physical robotic systems, and derives an approach that effectively leverages the data and policies learned for previous tasks to cumulatively grow the robot’s skillset.
Towards Closing the Sim-to-Real Gap in Collaborative Multi-Robot Deep Reinforcement Learning
TLDR
This work introduces the effect of sensing, calibration, and accuracy mismatches in distributed reinforcement learning with proximal policy optimization (PPO), and discusses how both the types of perturbation and the number of agents experiencing them affect the collaborative learning effort.
Deep Reinforcement Learning Versus Evolution Strategies: A Comparative Survey
TLDR
An overview of how DRL and ESs can be used, either independently or in unison, to solve specific learning tasks is presented; it is intended to help researchers select the method that suits them best and provides a bird's-eye view of the overall literature in the field.
Challenges of Reinforcement Learning
TLDR
This chapter introduces the existing challenges in deep reinforcement learning research and applications, and discusses these challenges together with potential solutions and research directions, as primers for the advanced topics in the second main part of the book, including Chaps. …
Few-Shot Model-Based Adaptation in Noisy Conditions
TLDR
It is shown that the proposed method, which explicitly addresses domain noise, improves few-shot adaptation error over a blackbox adaptation LSTM baseline, and over a model-free on-policy reinforcement learning approach, which tries to learn an adaptable and informative policy at the same time.
A multi-robot path-planning algorithm for autonomous navigation using meta-reinforcement learning based on transfer learning
  • Shuhuan Wen, Zeteng Wen, Di Zhang, Hong Zhang, Tao Wang
  • Computer Science
  • Appl. Soft Comput.
  • 2021
TLDR
This paper proposes dynamic proximal meta policy optimization with covariance matrix adaptation evolutionary strategies (dynamic-PPO-CMA), based on the original proximal policy optimization (PPO), to obtain a valid obstacle-avoidance policy and realize autonomous navigation.

References

Transferring Generalizable Motor Primitives From Simulation to Real World
TLDR
A novel sample-efficient transfer approach is proposed that is agnostic to the dynamics of the simulated system and is combined with incremental learning; it transfers a generalizable contextual policy generated in simulation, using one or a few samples from the real world, to a target global model that can generate policies across parameterized real-world situations.
Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning
TLDR
This work uses meta-learning to train a dynamics model prior such that, when combined with recent data, this prior can be rapidly adapted to the local context and demonstrates the importance of incorporating online adaptation into autonomous agents that operate in the real world.
Mutual Alignment Transfer Learning
TLDR
It is demonstrated empirically that the reciprocal alignment for both agents provides further benefit as the agent in simulation can adjust to optimize its behaviour for states commonly visited by the real-world agent.
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables
TLDR
This paper develops an off-policy meta-RL algorithm that disentangles task inference and control and performs online probabilistic filtering of latent task variables to infer how to solve a new task from small amounts of experience.
Model-Based Reinforcement Learning via Meta-Policy Optimization
TLDR
This work proposes Model-Based Meta-Policy-Optimization (MB-MPO), an approach that forgoes the strong reliance on accurate learned dynamics models and uses an ensemble of learned dynamic models to create a policy that can quickly adapt to any model in the ensemble with one policy gradient step.
Sim-to-Real Transfer of Robotic Control with Dynamics Randomization
TLDR
By randomizing the dynamics of the simulator during training, this paper is able to develop policies that are capable of adapting to very different dynamics, including ones that differ significantly from the dynamics on which the policies were trained.
Deep predictive policy training using reinforcement learning
TLDR
A data-efficient deep predictive policy training (DPPT) framework with a deep neural network policy architecture which maps an image observation to a sequence of motor activations and is demonstrated by training predictive policies for skilled object grasping and ball throwing on a PR2 robot.
End-to-End Training of Deep Visuomotor Policies
TLDR
This paper develops a method that can be used to learn policies that map raw image observations directly to torques at the robot's motors, trained using a partially observed guided policy search method, with supervision provided by a simple trajectory-centric reinforcement learning method.
GPU-Accelerated Robotic Simulation for Distributed Reinforcement Learning
TLDR
This work proposes using GPU-accelerated RL simulations as an alternative to CPU ones for speeding up Deep RL training, and shows promising speed-ups of learning various continuous-control, locomotion tasks.
Sim-to-Real: Learning Agile Locomotion For Quadruped Robots
TLDR
This system can learn quadruped locomotion from scratch using simple reward signals, and users can provide an open-loop reference to guide the learning process when more control over the learned gait is needed.