Corpus ID: 195886344

A Model-based Approach for Sample-efficient Multi-task Reinforcement Learning

@article{Landolfi2019AMA,
  title={A Model-based Approach for Sample-efficient Multi-task Reinforcement Learning},
  author={Nicholas C. Landolfi and Garrett Thomas and Tengyu Ma},
  journal={ArXiv},
  year={2019},
  volume={abs/1907.04964}
}
The aim of multi-task reinforcement learning is two-fold: (1) efficiently learn by training against multiple tasks and (2) quickly adapt, using limited samples, to a variety of new tasks. In this work, the tasks correspond to reward functions for environments with the same (or similar) dynamical models. We propose to learn a dynamical model during the training process and use this model to perform sample-efficient adaptation to new tasks at test time. We use significantly fewer samples by… 
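The core idea in the abstract, a single dynamics model fit on transitions pooled across training tasks, with a new task (defined only by its reward function) solved at test time by planning against that model, can be sketched compactly. This is a minimal illustration and not the authors' implementation: the toy linear system, the least-squares dynamics model, and the random-shooting MPC planner are all stand-in assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy shared dynamics, unknown to the agent: s' = A s + B a + noise.
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])
B_true = np.array([[0.0], [0.1]])

def step(s, a):
    return A_true @ s + B_true @ a + 0.01 * rng.standard_normal(2)

# 1) Collect transitions across training tasks. Since tasks share dynamics
#    and differ only in reward, every transition informs the same model.
SA, S_next = [], []
for _ in range(200):
    s = rng.standard_normal(2)
    a = rng.standard_normal(1)
    SA.append(np.concatenate([s, a]))
    S_next.append(step(s, a))
X, Y = np.array(SA), np.array(S_next)

# 2) Fit a linear dynamics model by least squares (a stand-in for the
#    neural-network model used in practice):  s' ≈ [s, a] @ W.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

def model(s, a):
    return np.concatenate([s, a]) @ W

# 3) Test-time adaptation: a *new* reward (drive the state to a goal) is
#    optimized by random-shooting MPC against the learned model, so no
#    extra environment samples are needed for planning.
goal = np.array([1.0, 0.0])

def plan(s, horizon=10, n_candidates=500):
    best_a, best_ret = None, -np.inf
    for _ in range(n_candidates):
        seq = rng.uniform(-1, 1, size=(horizon, 1))
        sim, ret = s.copy(), 0.0
        for a in seq:
            sim = model(sim, a)
            ret -= np.sum((sim - goal) ** 2)  # the new task's reward
        if ret > best_ret:
            best_ret, best_a = ret, seq[0]
    return best_a

s = np.zeros(2)
for _ in range(30):
    s = step(s, plan(s))
print("final distance to goal:", float(np.linalg.norm(s - goal)))
```

The closed loop replans at every step, so model error only has to be accurate over the short planning horizon rather than the whole episode.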


Multi-Task Reinforcement Learning as a Hidden-Parameter Block MDP
TLDR
This work uses structural assumptions from Hidden-Parameter and Block MDPs to propose a new framework, HiP-BMDP, and an approach for learning a common representation and universal dynamics model, and provides transfer and generalization bounds based on task and state similarity, as well as sample complexity bounds that depend on the aggregate number of samples across tasks.
Model-based Adversarial Meta-Reinforcement Learning
TLDR
This paper proposes Model-based Adversarial Meta-Reinforcement Learning (AdMRL), where it aims to minimize the worst-case sub-optimality gap -- the difference between the optimal return and the return that the algorithm achieves after adaptation -- across all tasks in a family of tasks, with a model-based approach.
Fractional Transfer Learning for Deep Model-Based Reinforcement Learning
TLDR
Fractional transfer learning is presented: the idea is to transfer fractions of knowledge, as opposed to discarding potentially useful knowledge as is commonly done with random initialization, using the World Model-based Dreamer algorithm.
Hidden-Parameter Block MDPs
TLDR
This work derives instantiations of this new framework for both multi-task reinforcement learning (MTRL) and meta-reinforcement learning (Meta-RL) settings, and provides transfer and generalization bounds based on task and state similarity, along with sample complexity bounds that depend on the aggregate number of samples across tasks, rather than the number of tasks.
Provably Efficient Multi-Task Reinforcement Learning with Model Transfer
TLDR
A heterogeneous multi-player RL problem, in which a group of players concurrently face similar but not necessarily identical MDPs, is formulated with a goal of improving their collective performance through inter-player information sharing.
On The Transferability of Deep-Q Networks
TLDR
The results show that transferring neural networks in a DRL context can be particularly challenging and in most cases results in negative transfer; in attempting to understand why Deep-Q Networks transfer so poorly, novel insights are gained into the training dynamics that characterize this family of algorithms.
Robot in a China Shop: Using Reinforcement Learning for Location-Specific Navigation Behaviour
TLDR
This paper proposes a new approach to navigation, where it is treated as a multi-task learning problem, which enables the robot to learn to behave differently in visual navigation tasks for different environments while also learning shared expertise across environments.
Sequoia: A Software Framework to Unify Continual Learning Research
TLDR
A taxonomy of settings is proposed, where each setting is described as a set of assumptions, and a tree-shaped hierarchy emerges from this view, where more general settings become the parents of those with more restrictive assumptions.
...

References

Showing 1-10 of 41 references
Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning
TLDR
This work uses meta-learning to train a dynamics model prior such that, when combined with recent data, this prior can be rapidly adapted to the local context and demonstrates the importance of incorporating online adaptation into autonomous agents that operate in the real world.
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems.
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
TLDR
This paper proposes a new algorithm called probabilistic ensembles with trajectory sampling (PETS) that combines uncertainty-aware deep network dynamics models with sampling-based uncertainty propagation, which matches the asymptotic performance of model-free algorithms on several challenging benchmark tasks, while requiring significantly fewer samples.
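The PETS recipe summarized above, an ensemble of probabilistic dynamics models combined with sampling-based uncertainty propagation, can be illustrated with a deliberately tiny sketch. Everything below is an assumption for illustration: bootstrapped linear models stand in for the paper's probabilistic neural networks, and a 1-D toy system stands in for the benchmark tasks.

```python
import numpy as np

rng = np.random.default_rng(1)

def true_step(s, a):
    # Toy 1-D dynamics, unknown to the planner.
    return 0.9 * s + 0.5 * a + 0.02 * rng.standard_normal()

# Fit an ensemble of models, each on a bootstrap resample of the same data,
# so that disagreement between members reflects epistemic uncertainty.
data = [(s, a, true_step(s, a))
        for s, a in zip(rng.standard_normal(100), rng.standard_normal(100))]

ensemble = []
for _ in range(5):
    idx = rng.integers(0, len(data), size=len(data))
    X = np.array([[data[i][0], data[i][1]] for i in idx])
    y = np.array([data[i][2] for i in idx])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    ensemble.append(w)

def ts_return(s0, actions, n_particles=20):
    """Trajectory sampling: each particle routes every step through a
    randomly chosen ensemble member, so model disagreement shows up as
    spread in the predicted returns."""
    returns = np.zeros(n_particles)
    for p in range(n_particles):
        s = s0
        for a in actions:
            w = ensemble[rng.integers(len(ensemble))]
            s = w @ np.array([s, a])
            returns[p] += -(s - 1.0) ** 2   # reward: reach s = 1
    return returns.mean()

# Random-shooting MPC over candidate action sequences, scored by the
# particle-averaged return under the ensemble.
candidates = rng.uniform(-1, 1, size=(300, 8))
best = max(candidates, key=lambda seq: ts_return(0.0, seq))
print("first planned action:", best[0])
```

Averaging returns over particles penalizes action sequences whose outcomes the ensemble disagrees on, which is what keeps the planner out of poorly modeled regions.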
Model-Based Reinforcement Learning for Atari
TLDR
Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models, is described and a comparison of several model architectures is presented, including a novel architecture that yields the best results in the authors' setting.
Multi-task reinforcement learning: a hierarchical Bayesian approach
TLDR
This work considers the problem of multi-task reinforcement learning, where the agent needs to solve a sequence of Markov Decision Processes chosen randomly from a fixed but unknown distribution, using a hierarchical Bayesian infinite mixture model.
Bayesian Multi-Task Reinforcement Learning
TLDR
This work considers the problem of multi-task reinforcement learning where the learner is provided with a set of tasks, for which only a small number of samples can be generated for any given policy; it adopts the Gaussian process temporal-difference value function model and uses a hierarchical Bayesian approach to model the distribution over value functions.
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
TLDR
This work defines a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously, and then generalize its knowledge to new domains, and uses Atari games as a testing environment to demonstrate these methods.
Using inaccurate models in reinforcement learning
TLDR
This paper presents a hybrid algorithm that requires only an approximate model, and only a small number of real-life trials, and achieves near-optimal performance in the real system, even when the model is only approximate.
Model-Ensemble Trust-Region Policy Optimization
TLDR
This paper analyzes the behavior of vanilla model-based reinforcement learning methods when deep neural networks are used to learn both the model and the policy, and shows that the learned policy tends to exploit regions where insufficient data is available for the model to be learned, causing instability in training.
Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning
TLDR
It is demonstrated that neural network dynamics models can in fact be combined with model predictive control (MPC) to achieve excellent sample complexity in a model-based reinforcement learning algorithm, producing stable and plausible gaits that accomplish various complex locomotion tasks.
...