# Linear Representation Meta-Reinforcement Learning for Instant Adaptation

```bibtex
@article{Peng2021LinearRM,
  title   = {Linear Representation Meta-Reinforcement Learning for Instant Adaptation},
  author  = {Matt Peng and Banghua Zhu and Jiantao Jiao},
  journal = {ArXiv},
  year    = {2021},
  volume  = {abs/2101.04750}
}
```

This paper introduces Fast Linearized Adaptive Policy (FLAP), a new meta-reinforcement learning (meta-RL) method that extrapolates well to out-of-distribution tasks without needing to reuse data from training, and adapts almost instantaneously using only a few samples at test time. FLAP builds upon the idea of learning a shared linear representation of the policy, so that adapting to a new task only requires predicting a set of linear weights. A separate adapter network…
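The core idea lends itself to a small sketch. Assuming, hypothetically, a shared feature map `phi` (a stand-in for the meta-trained network) and per-task linear heads, adaptation reduces to producing one weight matrix per task; nothing here is the authors' actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sketch of FLAP's core idea (all names are illustrative): every
# task shares a nonlinear feature extractor phi, and each task's policy head
# is just a linear map applied to those shared features.

def phi(state, W_shared):
    """Shared feature network: one hidden tanh layer, standing in for a deep net."""
    return np.tanh(W_shared @ state)

def policy_action(state, W_shared, w_task):
    """Task-specific policy = linear readout over shared features."""
    return w_task @ phi(state, W_shared)

state_dim, feat_dim, act_dim = 4, 8, 2
W_shared = rng.normal(size=(feat_dim, state_dim))  # meta-trained, frozen at test time

# Adapting to a new task only requires producing an (act_dim x feat_dim) weight
# matrix -- e.g. predicted in one shot by an adapter network from a few
# transitions -- rather than taking gradient steps on the whole policy.
w_task_A = rng.normal(size=(act_dim, feat_dim))
w_task_B = rng.normal(size=(act_dim, feat_dim))

s = rng.normal(size=state_dim)
a_A = policy_action(s, W_shared, w_task_A)
a_B = policy_action(s, W_shared, w_task_B)
print(a_A.shape, a_B.shape)  # same shared features, different linear heads
```

The separation matters for speed: the expensive nonlinear part is computed once and shared, while per-task adaptation touches only the small linear head.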


#### 2 Citations

Improving Generalization in Meta-RL with Imaginary Tasks from Latent Dynamics Mixture

- Computer Science
- ArXiv
- 2021

The authors propose LDM, which trains a reinforcement learning agent on imaginary tasks generated from mixtures of learned latent dynamics. LDM significantly outperforms standard meta-RL methods in test returns on gridworld navigation and MuJoCo tasks where the training and test task distributions are strictly separated.

Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment

- Computer Science
- ICML
- 2021

This paper augments a learned dynamics model with simple transformations that seek to capture potential changes in physical properties of the robot, leading to more robust policies.

#### References

Showing 1–10 of 43 references

Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling

- Computer Science, Mathematics
- ArXiv
- 2020

This work presents model identification and experience relabeling (MIER), a meta-reinforcement learning algorithm that is both efficient and extrapolates well when faced with out-of-distribution tasks at test time.

Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables

- Computer Science, Mathematics
- ICML
- 2019

This paper develops an off-policy meta-RL algorithm that disentangles task inference and control and performs online probabilistic filtering of latent task variables to infer how to solve a new task from small amounts of experience.

Continuous control with deep reinforcement learning

- Computer Science, Mathematics
- ICLR
- 2016

This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
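The deterministic policy gradient that the method builds on can be illustrated on a toy one-parameter problem (a minimal sketch with made-up dynamics, not DDPG itself):

```python
import numpy as np

# Toy illustration of the deterministic policy gradient (DPG):
#   grad_theta J = E[ grad_theta mu_theta(s) * grad_a Q(s, a) |_{a = mu_theta(s)} ]
# Here mu_theta(s) = theta * s (scalar) and Q(s, a) = -(a - s)^2, so the optimal
# policy parameter is theta = 1. None of this is DDPG's actual network code.

rng = np.random.default_rng(1)
theta = 0.0
lr = 0.1
for _ in range(200):
    s = rng.normal()
    a = theta * s                     # deterministic action
    dQ_da = -2.0 * (a - s)            # gradient of Q w.r.t. the action
    dmu_dtheta = s                    # gradient of the policy w.r.t. its parameter
    theta += lr * dmu_dtheta * dQ_da  # DPG ascent step
print(round(theta, 2))
```

DDPG combines this update with a learned critic, target networks, and replay; the sketch keeps only the chain rule that makes the actor update work.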

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

- Computer Science
- ICML
- 2017

We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning…
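The gradient-through-gradient idea can be sketched on a toy one-dimensional regression family (purely illustrative; the derivative of the inner step is written out by hand for this quadratic loss, and the same data is reused for the outer step for brevity):

```python
import numpy as np

# Minimal MAML-style sketch (not the paper's code): tasks are y = a * x with
# task-specific slope a; the model is y_hat = theta * x. The meta-update
# differentiates through one inner gradient step.

def loss_grad(theta, a, x):
    # L = mean((theta*x - a*x)^2); dL/dtheta = mean(2 * x^2 * (theta - a))
    return np.mean(2.0 * x**2 * (theta - a))

rng = np.random.default_rng(0)
theta, alpha, beta = 0.0, 0.05, 0.01   # meta-init, inner lr, outer lr
for _ in range(500):
    a = rng.uniform(0.5, 2.0)          # sample a task
    x = rng.normal(size=16)
    theta_adapted = theta - alpha * loss_grad(theta, a, x)   # inner step
    # Outer gradient through the inner step: for this quadratic loss,
    # d(theta_adapted)/d(theta) = 1 - alpha * mean(2 * x^2).
    m = np.mean(2.0 * x**2)
    outer_g = loss_grad(theta_adapted, a, x) * (1.0 - alpha * m)
    theta -= beta * outer_g            # meta-update
print(round(theta, 2))  # settles near the center of the task distribution
```

The meta-learned `theta` ends up positioned so that a single inner step reaches any sampled task well, which is exactly the model-agnostic objective.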

Successor Features for Transfer in Reinforcement Learning

- Computer Science
- NIPS
- 2017

This work proposes a transfer framework for the scenario where the reward function changes between tasks but the environment's dynamics remain the same. It derives two theorems that put the approach on firm theoretical ground and presents experiments showing that it successfully promotes transfer in practice.
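The decomposition it relies on, Q_pi(s, a) = psi_pi(s, a) · w, makes the transfer mechanism easy to sketch; the psi values below are invented for illustration:

```python
import numpy as np

# Successor-features sketch (illustrative, not the paper's code): if the reward
# decomposes as r(s, a) = phi(s, a) . w, then Q_pi(s, a) = psi_pi(s, a) . w,
# where psi_pi is the expected discounted sum of future features under pi.
# Transferring to a new reward means keeping psi and swapping in a new w.

psi = np.array([[1.0, 0.5, 0.0],     # psi_pi(s, a) for two candidate actions
                [0.2, 1.5, 1.0]])

w_task1 = np.array([1.0, 0.0, 0.0])  # reward weights for the original task
w_task2 = np.array([0.0, 0.0, 1.0])  # a new task: only the third feature pays

q1 = psi @ w_task1   # Q-values under task 1
q2 = psi @ w_task2   # Q-values under task 2, with no re-learning of psi
print(q1.argmax(), q2.argmax())  # best action flips: 0, then 1
```

This is the same "shared representation plus cheap linear readout" structure that FLAP exploits on the policy side.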

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

- Computer Science, Mathematics
- ICML
- 2018

This paper proposes soft actor-critic, an off-policy actor-critic deep RL algorithm based on the maximum entropy reinforcement learning framework. It achieves state-of-the-art performance on a range of continuous control benchmark tasks, outperforming prior on-policy and off-policy methods.
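The maximum-entropy principle behind it can be illustrated for a single state, where the entropy-optimal policy is a softmax over Q-values with temperature alpha (a sketch of the principle only, not SAC's actor-critic machinery):

```python
import numpy as np

# Maximum-entropy RL sketch (illustrative): with temperature alpha, the
# single-state optimal policy is pi(a) ∝ exp(Q(a) / alpha), trading off
# expected reward against policy entropy.

def soft_policy(q, alpha):
    z = np.exp((q - q.max()) / alpha)  # subtract max for numerical stability
    return z / z.sum()

q = np.array([1.0, 0.9, 0.0])
greedy = soft_policy(q, alpha=0.01)  # low temperature -> nearly deterministic
soft = soft_policy(q, alpha=1.0)     # high temperature -> keeps exploring
print(np.round(greedy, 2), np.round(soft, 2))
```

As alpha shrinks the policy collapses onto the greedy action; as it grows, probability mass spreads across near-optimal actions, which is what makes the framework robust for exploration.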

Benchmarking Deep Reinforcement Learning for Continuous Control

- Computer Science, Mathematics
- ICML
- 2016

This work presents a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with very high state and action dimensionality such as 3D humanoid locomotion, tasks with partial observations, and tasks with hierarchical structure.

Meta-Reinforcement Learning of Structured Exploration Strategies

- Computer Science, Mathematics
- NeurIPS
- 2018

This work introduces a novel gradient-based fast adaptation algorithm, model-agnostic exploration with structured noise (MAESN), to learn exploration strategies from prior experience that are informed by prior knowledge and more effective than random action-space noise.

End-to-End Training of Deep Visuomotor Policies

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2016

This paper develops a method that can be used to learn policies that map raw image observations directly to torques at the robot's motors, trained using a partially observed guided policy search method, with supervision provided by a simple trajectory-centric reinforcement learning method.

A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning

- Computer Science
- Found. Trends Mach. Learn.
- 2013

This article describes algorithms in a unified framework, giving pseudocode together with memory and iteration complexity analysis for each. Empirical evaluations of these techniques with four representations across four domains provide insight into how these algorithms perform with various feature sets in terms of running time and performance.
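A linear value-function approximator of the kind the tutorial covers can be shown in a few lines, here with one-hot features so TD(0) recovers the exact values (the tiny chain MDP is invented for illustration):

```python
import numpy as np

# Linear value-function approximation sketch: V(s) ≈ w . x(s).
# TD(0) on a 3-state chain 0 -> 1 -> 2 (terminal), reward +1 on reaching the
# terminal state, gamma = 1, one-hot features, so w should converge to the
# true values V = [1, 1, 0].

def x(s):
    f = np.zeros(3)
    f[s] = 1.0
    return f

w = np.zeros(3)
alpha, gamma = 0.1, 1.0
for _ in range(500):
    for s, s_next, r, terminal in [(0, 1, 0.0, False), (1, 2, 1.0, True)]:
        target = r + (0.0 if terminal else gamma * w @ x(s_next))
        w += alpha * (target - w @ x(s)) * x(s)   # TD(0) update on the weights
print(np.round(w, 2))  # → [1. 1. 0.]
```

With richer (non-one-hot) feature vectors the same update generalizes across states, which is the representational idea that both successor features and FLAP's linear policy heads build on.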