Meta Reinforcement Learning for Sim-to-real Domain Adaptation

@inproceedings{Arndt2020MetaRL,
  title={Meta Reinforcement Learning for Sim-to-real Domain Adaptation},
  author={Karol Arndt and Murtaza Hazara and Ali Ghadirzadeh and Ville Kyrki},
  booktitle={2020 IEEE International Conference on Robotics and Automation (ICRA)},
  year={2020},
  pages={2725--2731}
}
Modern reinforcement learning methods suffer from low sample efficiency and unsafe exploration, making it infeasible to train robotic policies entirely on real hardware. In this work, we propose to address the problem of sim-to-real domain transfer by using meta learning to train a policy that can adapt to a variety of dynamic conditions, and using a task-specific trajectory generation model to provide an action space that facilitates quick exploration. We evaluate the method by performing…
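The adaptation recipe the abstract describes — meta-training a policy across many simulated dynamics settings so that a few gradient steps suffice to adapt on hardware — can be sketched as a first-order MAML-style loop on a toy problem. Everything below (the dynamics parameters, the surrogate loss, the learning rates) is an illustrative stand-in, not the authors' implementation:

```python
import random

def sample_dynamics():
    # Each simulated "task" is a different dynamics setting (e.g. friction, mass).
    return {"friction": random.uniform(0.2, 1.0), "mass": random.uniform(0.5, 2.0)}

def rollout_loss(theta, dyn):
    # Hypothetical surrogate: loss is lowest when the policy parameter
    # matches the dynamics-dependent optimum.
    target = dyn["friction"] * dyn["mass"]
    return (theta - target) ** 2

def grad(theta, dyn):
    # Analytic d/dtheta of rollout_loss above.
    target = dyn["friction"] * dyn["mass"]
    return 2.0 * (theta - target)

def meta_train(steps=200, inner_lr=0.1, outer_lr=0.05):
    theta = 0.0  # meta-policy parameter
    for _ in range(steps):
        dyn = sample_dynamics()
        # Inner loop: one fast adaptation step in the sampled dynamics.
        adapted = theta - inner_lr * grad(theta, dyn)
        # Outer loop: update the meta-parameter toward post-adaptation
        # performance (first-order approximation, as in FOMAML).
        theta -= outer_lr * grad(adapted, dyn)
    return theta
```

Under this scheme the meta-parameter settles near a point from which one inner step reaches any sampled task's optimum quickly, which is the property the paper relies on for fast real-world adaptation.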
Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms
The proposed method successfully adapts a trained policy to different robotic platforms with novel physical parameters, and the meta-learning algorithm is shown to outperform state-of-the-art methods on the introduced few-shot policy adaptation problem.
Sim-to-Real Transfer in Deep Reinforcement Learning for Robotics: a Survey
Covers the fundamental background behind sim-to-real transfer in deep reinforcement learning and overviews the main methods currently in use: domain randomization, domain adaptation, imitation learning, meta-learning, and knowledge distillation.
A Domain Data Pattern Randomization based Deep Reinforcement Learning method for Sim-to-Real transfer
  • Peng Gong, Dian-xi Shi, Chao Xue, Xucan Chen
  • Computer Science
  • ICIAI
  • 2021
Transferring reinforcement learning policies trained in a physical simulator to the real world is a highly challenging problem, because the gap between simulation and reality usually causes the…
Towards Closing the Sim-to-Real Gap in Collaborative Multi-Robot Deep Reinforcement Learning
Introduces the effects of sensing, calibration, and accuracy mismatches in distributed reinforcement learning with proximal policy optimization (PPO), and discusses how both the different types of perturbance and the number of agents experiencing them affect the collaborative learning effort.
Challenges of Reinforcement Learning
Introduces the existing challenges in deep reinforcement learning research and applications, and pairs these challenges with potential solutions and research directions as primers for the advanced topics in the second main part of the book, including Chaps. …
Few-Shot Model-Based Adaptation in Noisy Conditions
Shows that the proposed method, which explicitly addresses domain noise, improves few-shot adaptation error over a black-box LSTM adaptation baseline and over a model-free on-policy reinforcement learning approach that tries to learn an adaptable and informative policy at the same time.
Crossing the Gap: A Deep Dive into Zero-Shot Sim-to-Real Transfer for Dynamics
Surprisingly, finds that a method which simply injects random forces into the simulation performs just as well as more complex methods, such as those which randomise the simulator's dynamics parameters or adapt a policy online using recurrent network architectures.
Population-Based Evolution Optimizes a Meta-Learning Objective
Argues that population-based evolutionary systems with non-static fitness landscapes naturally bias towards high-evolvability genomes, and therefore optimize for populations with strong learning ability, and demonstrates this with a simple evolutionary algorithm, Population-Based Meta Learning (PBML), that consistently discovers such genomes.
An adaptive deep reinforcement learning framework enables curling robots with human-like performance in real-world conditions
Reports a curling robot that achieves human-level performance in the game of curling using an adaptive deep reinforcement learning framework, indicating that the gap between physics-based simulators and the real world can be narrowed.
Ubiquitous Distributed Deep Reinforcement Learning at the Edge: Analyzing Byzantine Agents in Discrete Action Spaces
Discusses some of the challenges in multi-agent distributed deep reinforcement learning that can occur in the presence of Byzantine or malfunctioning agents, and shows how wrong discrete actions can significantly affect the collaborative learning effort.

References

Showing 1–10 of 36 references
Transferring Generalizable Motor Primitives From Simulation to Real World
A novel sample-efficient transfer approach that is agnostic to the dynamics of the simulated system and is combined with incremental learning: it transfers a generalizable contextual policy generated in simulation, using one or a few samples from the real world, to a target global model that can generate policies across parameterized real-world situations.
Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning
Uses meta-learning to train a dynamics-model prior such that, when combined with recent data, this prior can be rapidly adapted to the local context, demonstrating the importance of incorporating online adaptation into autonomous agents that operate in the real world.
Mutual Alignment Transfer Learning
Demonstrates empirically that the reciprocal alignment for both agents provides further benefit, as the agent in simulation can adjust to optimize its behaviour for states commonly visited by the real-world agent.
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables
Develops an off-policy meta-RL algorithm that disentangles task inference and control, performing online probabilistic filtering of latent task variables to infer how to solve a new task from small amounts of experience.
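The "online probabilistic filtering of latent task variables" idea in the entry above can be illustrated with a toy conjugate-Gaussian update, where each observed transition sharpens a belief over a scalar task variable. This is a didactic sketch of Bayesian task inference, not the paper's actual algorithm:

```python
def update_task_belief(mu, var, obs, obs_var=0.5):
    """One Bayesian filtering step for a Gaussian belief over a scalar
    latent task variable, given a noisy observation of it."""
    k = var / (var + obs_var)   # Kalman-style gain: trust data more when belief is uncertain
    new_mu = mu + k * (obs - mu)
    new_var = (1.0 - k) * var
    return new_mu, new_var

def infer_task(observations, prior_mu=0.0, prior_var=10.0):
    """Condense a handful of transitions into a task belief (mean, variance)."""
    mu, var = prior_mu, prior_var
    for obs in observations:
        mu, var = update_task_belief(mu, var, obs)
    return mu, var
```

After only a few consistent observations the posterior mean moves close to the underlying task value while its variance shrinks, which is what lets a meta-trained policy condition on a confident task estimate from small amounts of experience.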
Model-Based Reinforcement Learning via Meta-Policy Optimization
Proposes Model-Based Meta-Policy-Optimization (MB-MPO), an approach that forgoes the strong reliance on accurate learned dynamics models by using an ensemble of learned dynamics models to create a policy that can quickly adapt to any model in the ensemble with one policy-gradient step.
Sim-to-Real Transfer of Robotic Control with Dynamics Randomization
By randomizing the dynamics of the simulator during training, develops policies that are capable of adapting to very different dynamics, including ones that differ significantly from the dynamics on which the policies were trained.
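The dynamics-randomization recipe summarized above — resampling physical parameters every episode so the policy cannot overfit to one simulator setting — reduces to a small sampling wrapper around the training loop. The parameter names and ranges below are illustrative assumptions, not values from the paper:

```python
import random

# Illustrative randomization ranges for a simulated manipulator;
# the specific parameters and bounds are assumptions.
RANDOMIZATION_RANGES = {
    "link_mass_scale": (0.8, 1.2),
    "joint_damping": (0.01, 0.5),
    "friction_coeff": (0.5, 1.5),
    "action_delay_ms": (0.0, 40.0),
}

def sample_dynamics(rng=random):
    """Draw one dynamics configuration, to be applied at episode reset."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in RANDOMIZATION_RANGES.items()}

def train_episodes(env_reset, collect_rollout, episodes=3):
    """Skeleton training loop: re-randomize the simulator every episode."""
    configs = []
    for _ in range(episodes):
        dyn = sample_dynamics()
        env_reset(dyn)        # push the sampled parameters into the simulator
        collect_rollout(dyn)  # ...gather experience under these dynamics...
        configs.append(dyn)
    return configs
```

Because every episode sees a different draw, the learned policy must work across the whole range of each parameter rather than exploiting one fixed simulator configuration.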
Deep predictive policy training using reinforcement learning
A data-efficient deep predictive policy training (DPPT) framework with a deep neural-network policy architecture that maps an image observation to a sequence of motor activations, demonstrated by training predictive policies for skilled object grasping and ball throwing on a PR2 robot.
End-to-End Training of Deep Visuomotor Policies
Develops a method for learning policies that map raw image observations directly to torques at the robot's motors, trained using a partially observed guided policy search method with supervision provided by a simple trajectory-centric reinforcement learning method.
GPU-Accelerated Robotic Simulation for Distributed Reinforcement Learning
Proposes GPU-accelerated RL simulations as an alternative to CPU ones for speeding up deep RL training, showing promising speed-ups when learning various continuous-control locomotion tasks.
Sim-to-Real: Learning Agile Locomotion For Quadruped Robots
A system that learns quadruped locomotion from scratch using simple reward signals; users can provide an open-loop reference to guide the learning process when more control over the learned gait is needed.