An Empirical Investigation of Representation Learning for Imitation

@article{Chen2021AnEI,
  title={An Empirical Investigation of Representation Learning for Imitation},
  author={Cynthia Chen and Xin Chen and Sam Toyer and Cody Wild and Scott Emmons and Ian S. Fischer and Kuang-Huei Lee and Neel Alex and Steven H. Wang and Ping Luo and Stuart J. Russell and P. Abbeel and Rohin Shah},
  journal={ArXiv},
  year={2021},
  volume={abs/2205.07886}
}
Imitation learning often needs a large demonstration set in order to handle the full range of situations that an agent might find itself in during deployment. However, collecting expert demonstrations can be expensive. Recent work in vision, reinforcement learning, and NLP has shown that auxiliary representation learning objectives can reduce the need for large amounts of expensive, task-specific data. Our Empirical Investigation of Representation Learning for Imitation (EIRLI) investigates…

Does Self-supervised Learning Really Improve Reinforcement Learning from Pixels?
TLDR
It is suggested that no single self-supervised loss or image augmentation method can dominate all environments and that the current framework for joint optimization of SSL and RL is limited.
[Figure residue in place of title: replay buffer, encoders, goal DB, task demonstrations, online goal selection]
TLDR
This work extends hindsight relabelling mechanisms to guide exploration along task-specific distributions implied by a small set of successful demonstrations, and achieves a significantly higher overall performance as task complexity increases.
VIPer: Iterative Value-Aware Model Learning on the Value Improvement Path
  • Computer Science
  • 2022
TLDR
A practical and generalizable Decision-Aware Model-Based Reinforcement Learning algorithm to improve the generalization of VAML-like model learning and shows theoretically for linear and tabular spaces that the proposed algorithm is sensible, justifying extension to non-linear and continuous spaces.
PI-ARS: Accelerating Evolution-Learned Visual-Locomotion with Predictive Information Representations
TLDR
Predictive Information Augmented Random Search (PI-ARS) is developed, which combines a gradient-based representation learning technique, Predictive Information (PI), with a gradient-free ES algorithm, Augmented Random Search (ARS), to train policies that can process complex robot sensory inputs and handle highly nonlinear robot dynamics.

References

SHOWING 1-10 OF 38 REFERENCES
SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards
TLDR
This work proposes a simple alternative that still uses RL, but does not require learning a reward function, and can be implemented with a handful of minor modifications to any standard Q-learning or off-policy actor-critic algorithm, called soft Q imitation learning (SQIL).
Decoupling Representation Learning from Reinforcement Learning
TLDR
A new unsupervised learning task, called Augmented Temporal Contrast (ATC), which trains a convolutional encoder to associate pairs of observations separated by a short time difference, under image augmentations and using a contrastive loss.
Data-Efficient Reinforcement Learning with Momentum Predictive Representations
TLDR
This work trains an agent to predict its own latent state representations multiple steps into the future using an encoder which is an exponential moving average of the agent's parameters, and makes predictions using a learned transition model.
The MAGICAL Benchmark for Robust Imitation
TLDR
Using the MAGICAL suite, it is confirmed that existing IL algorithms overfit significantly to the context in which demonstrations are provided, and it is suggested that new approaches will be needed in order to robustly generalise demonstrator intent.
Generative Adversarial Imitation Learning
TLDR
A new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning, is proposed and a certain instantiation of this framework draws an analogy between imitation learning and generative adversarial networks.
Task-Relevant Adversarial Imitation Learning
TLDR
This work proposes a solution to a critical problem in adversarial imitation, Task-Relevant Adversarial Imitation Learning (TRAIL), which uses a constrained optimization objective to overcome task-irrelevant features.
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning
Learning a good representation is an essential component for deep reinforcement learning (RL). Representation learning is especially important in multitask and partially observable settings where
Deep Variational Reinforcement Learning for POMDPs
TLDR
Deep variational reinforcement learning (DVRL) is proposed, which introduces an inductive bias that allows an agent to learn a generative model of the environment and perform inference in that model to effectively aggregate the available information.
Learning Representations in Reinforcement Learning: An Information Bottleneck Approach
TLDR
This paper analytically derives the optimal conditional distribution of the representation, provides a variational lower bound, and theoretically derives an algorithm to optimize the information bottleneck framework without constructing the lower bound.
Reinforcement Learning with Augmented Data
TLDR
It is shown that augmentations such as random translate, crop, color jitter, patch cutout, random convolutions, and amplitude scale can enable simple RL algorithms to outperform complex state-of-the-art methods across common benchmarks.