Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms

@inproceedings{Ghadirzadeh2021BayesianMF,
  title={Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms},
  author={Ali Ghadirzadeh and Xi Chen and Petra Poklukar and Chelsea Finn and M{\aa}rten Bj{\"o}rkman and Danica Kragic},
  booktitle={2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2021},
  pages={1274--1280}
}
  • Ali Ghadirzadeh, Xi Chen, Petra Poklukar, Chelsea Finn, Mårten Björkman, Danica Kragic
  • Published 5 March 2021
  • Computer Science
  • 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Reinforcement learning methods can achieve significant performance but require a large amount of training data collected on the same robotic platform. A policy trained with expensive data is rendered useless after making even a minor change to the robot hardware. In this paper, we address the challenging problem of adapting a policy, trained to perform a task, to a novel robotic hardware platform given only a few demonstrations of robot motion trajectories on the target robot. We formulate it as…

Latent-Variable Advantage-Weighted Policy Optimization for Offline Reinforcement Learning

TLDR
This work proposes to leverage a latent-variable generative model to represent high-advantage state-action pairs, leading to better adherence to the data distributions that contribute to solving the task, while maximizing reward via a policy over the latent variable.

Latent-Variable Advantage-Weighted Policy Optimization for Offline RL

TLDR
This work proposes to leverage latent-variable policies that can represent a broader class of policy distributions, leading to better adherence to the training data distribution while maximizing reward via a policy over the latent variable.

MetaMorph: Learning Universal Controllers with Transformers

TLDR
This work proposes MetaMorph, a Transformer-based approach to learn a universal controller over a modular robot design space, based on the insight that robot morphology is just another modality on which the authors can condition the output of a Transformer.

Meta Reinforcement Learning Based Sensor Scanning in 3D Uncertain Environments for Heterogeneous Multi-Robot Systems

TLDR
The experimental results demonstrate the meta-learning approach can outperform other methods by approximately 15%-27% on success rate and 70%-75% on adaptation speed.

Skeletal Feature Compensation for Imitation Learning with Embodiment Mismatch

TLDR
The proposed imitation learning technique, SILEM (Skeletal feature compensation for Imitation Learning with Embodiment Mismatch), addresses a particular type of embodiment mismatch by introducing a learned affine transform to compensate for differences in the skeletal features obtained from the learner and expert.

Sharing to learn and learning to share - Fitting together Meta-Learning, Multi-Task Learning, and Transfer Learning : A meta review

TLDR
The global generic learning network – an amalgamation of meta learning, transfer learning, and multi-task learning – is introduced here, along with some open research questions and future research directions in the multi-task setting.

Reimagining an autonomous vehicle

TLDR
It is argued that a rethink is required, reconsidering the autonomous vehicle problem in the light of the body of knowledge that has been gained since the DARPA challenges, and an alternative vision is presented: a recipe for driving with machine learning, and grand challenges for research in driving.

Calibration of Few-Shot Classification Tasks: Mitigating Misconfidence from Distribution Mismatch

TLDR
This study proposes a novel meta-training method that measures the distribution mismatch and enables the model to predict with more precise confidence, and shows that the training strategy prevents the model from becoming indiscriminately confident, thereby helping the model to produce calibrated classification results without loss of accuracy.

Meta-Residual Policy Learning: Zero-Trial Robot Skill Adaptation via Knowledge Fusion

TLDR
Meta-Residual Policy Learning (MRPL) is proposed to reduce the cost of policy learning and adaptation and outperforms prior methods in robot skill adaptation.

References

Probabilistic Active Meta-Learning

TLDR
This work introduces task selection based on prior experience into a meta-learning algorithm by conceptualizing the learner and the active meta-learning setting using a probabilistic latent variable model and provides empirical evidence that this approach improves data-efficiency when compared to strong baselines on simulated robotic experiments.

Meta Reinforcement Learning for Sim-to-real Domain Adaptation

TLDR
This work proposes to address the problem of sim-to-real domain transfer by using meta learning to train a policy that can adapt to a variety of dynamic conditions, and using a task-specific trajectory generation model to provide an action space that facilitates quick exploration.

Learning modular neural network policies for multi-task and multi-robot transfer

TLDR
The effectiveness of the transfer method for enabling zero-shot generalization with a variety of robots and tasks in simulation for both visual and non-visual tasks is demonstrated.

One-Shot Visual Imitation Learning via Meta-Learning

TLDR
A meta-imitation learning method that enables a robot to learn how to learn more efficiently, allowing it to acquire new skills from just a single demonstration, and requires data from significantly fewer prior tasks for effective learning of new skills.

Deep visual foresight for planning robot motion

  • Chelsea Finn, S. Levine
  • Computer Science
  • 2017 IEEE International Conference on Robotics and Automation (ICRA)
  • 2017
TLDR
This work develops a method for combining deep action-conditioned video prediction models with model-predictive control that uses entirely unlabeled training data and enables a real robot to perform nonprehensile manipulation — pushing objects — and can handle novel objects not seen during training.

Probabilistic Model-Agnostic Meta-Learning

TLDR
This paper proposes a probabilistic meta-learning algorithm that can sample models for a new task from a model distribution that is trained via a variational lower bound, and shows how reasoning about ambiguity can also be used for downstream active learning problems.

One-Shot Imitation Learning

TLDR
A meta-learning framework for achieving one-shot imitation learning, where ideally, robots should be able to learn from very few demonstrations of any given task, and instantly generalize to new situations of the same task, without requiring task-specific engineering.

One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning

TLDR
This work presents an approach for one-shot learning from a video of a human by using human and robot demonstration data from a variety of previous tasks to build up prior knowledge through meta-learning, then combining this prior knowledge and only a single video demonstration from a human, the robot can perform the task that the human demonstrated.

Deep predictive policy training using reinforcement learning

TLDR
A data-efficient deep predictive policy training (DPPT) framework with a deep neural network policy architecture which maps an image observation to a sequence of motor activations and is demonstrated by training predictive policies for skilled object grasping and ball throwing on a PR2 robot.