Retrieval-Augmented Reinforcement Learning
@article{Goyal2022RetrievalAugmentedRL,
title={Retrieval-Augmented Reinforcement Learning},
author={Anirudh Goyal and Abram L. Friesen and Andrea Banino and Th{\'e}ophane Weber and Nan Rosemary Ke and Adri{\`a} Puigdom{\`e}nech Badia and Arthur Guez and Mehdi Mirza and Ksenia Konyushkova and Michal Valko and Simon Osindero and Timothy P. Lillicrap and Nicolas Manfred Otto Heess and Charles Blundell},
journal={ArXiv},
year={2022},
volume={abs/2202.08417}
}

Most deep reinforcement learning (RL) algorithms distill experience into parametric behavior policies or value functions via gradient updates. While effective, this approach has several disadvantages: (1) it is computationally expensive, (2) it can take many updates to integrate experiences into the parametric model, (3) experiences that are not fully integrated do not appropriately influence the agent's behavior, and (4) behavior is limited by the capacity of the model. In this paper we…
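The abstract's core idea, an agent that conditions its behavior on retrieved past experiences rather than relying solely on parameters, can be illustrated with a minimal sketch. All names here (`RetrievalAugmentedPolicy`, the mean-pooling of neighbours, the linear policy head) are hypothetical simplifications, not the paper's actual architecture, which uses learned attention over the retrieved set:

```python
import numpy as np

class RetrievalAugmentedPolicy:
    """Toy sketch: augment a state encoding with k-nearest-neighbour
    retrieval over a non-parametric buffer of past experience encodings."""

    def __init__(self, encode_dim, n_actions, k=3, seed=0):
        rng = np.random.default_rng(seed)
        self.k = k
        self.buffer = []  # list of past experience encodings
        # Linear "policy head" over [state ; retrieved-context] features.
        self.W = rng.normal(scale=0.1, size=(n_actions, 2 * encode_dim))

    def add_experience(self, encoding):
        self.buffer.append(np.asarray(encoding, dtype=float))

    def retrieve(self, query):
        # k nearest neighbours of the query by L2 distance.
        dists = [np.linalg.norm(e - query) for e in self.buffer]
        idx = np.argsort(dists)[: self.k]
        return np.stack([self.buffer[i] for i in idx])

    def act(self, state_encoding):
        query = np.asarray(state_encoding, dtype=float)
        retrieved = self.retrieve(query)
        # Mean pooling stands in for the learned attention over neighbours.
        context = retrieved.mean(axis=0)
        features = np.concatenate([query, context])
        return int(np.argmax(self.W @ features))
```

The point of the sketch is advantage (2)/(3) from the abstract: a new experience added to `buffer` influences action selection on the very next step, with no gradient update required.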
19 Citations
Policy Regularization with Dataset Constraint for Offline Reinforcement Learning
- 2023
Computer Science
ArXiv
Empirical evidence and theoretical analysis show that PRDC can alleviate offline RL's fundamentally challenging value-overestimation issue with a bounded performance gap; on a set of locomotion and navigation tasks, PRDC achieves state-of-the-art performance compared with existing methods.
Augmented Modular Reinforcement Learning based on Heterogeneous Knowledge
- 2023
Computer Science
ArXiv
This work proposes Augmented Modular Reinforcement Learning (AMRL), a new framework that uses an arbitrator to select heterogeneous modules and seamlessly incorporate different types of knowledge, and introduces a variation of the selection mechanism, namely the Memory-Augmented Arbitrator, which adds the capability of exploiting temporal information.
Chain of Knowledge: A Framework for Grounding Large Language Models with Structured Knowledge Bases
- 2023
Computer Science
ArXiv
The Chain of Knowledge framework is introduced, a framework that augments large language models with structured knowledge bases to improve factual correctness and reduce hallucination, together with a query generator model with contrastive instruction-tuning to assist large language models in effectively querying knowledge bases.
Domain Adaptation with External Off-Policy Acoustic Catalogs for Scalable Contextual End-to-End Automated Speech Recognition
- 2023
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Retrieved Sequence Augmentation for Protein Representation Learning
- 2023
Computer Science, Biology
bioRxiv
This study introduces a novel paradigm called Retrieved Sequence Augmentation (RSA) that enhances protein representation learning without necessitating additional alignment or preprocessing, and demonstrates that protein language models benefit from retrieval enhancement in both structural and property prediction tasks.
Complex QA and language models hybrid architectures, Survey
- 2023
Computer Science
ArXiv
This paper reviews the state of the art of language model architectures and strategies for "complex" question-answering (QA, CQA, CPS) with a focus on hybridization, and discusses some challenges associated with complex QA.
Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration
- 2023
Computer Science
ArXiv
In this version of the Symbolic Alchemy benchmark, the method's adaptation speed and exploration-exploitation balance approach those of an exact posterior sampling oracle and it is shown that even though partial models exclude relevant information from the environment, they can nevertheless lead to good policies.
REPLUG: Retrieval-Augmented Black-Box Language Models
- 2023
Computer Science
ArXiv
We introduce REPLUG, a retrieval-augmented language modeling framework that treats the language model (LM) as a black box and augments it with a tuneable retrieval model. Unlike prior…
Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition
- 2023
Computer Science
ArXiv
This work investigates the potential of leveraging external knowledge, particularly through off-policy key-value stores generated with text-to-speech methods, to allow flexible post-training adaptation of production ASR systems to new data distributions in challenging zero- and few-shot scenarios.
A Task-Agnostic Regularizer for Diverse Subpolicy Discovery in Hierarchical Reinforcement Learning
- 2023
Computer Science
IEEE Transactions on Systems, Man, and Cybernetics: Systems
A task-agnostic regularizer for learning diverse subpolicies in HRL is proposed that can improve upon the state-of-the-art performance on all three HRL domains without modifying any existing hyperparameters, indicating the wide applicability and robustness of the approach.
94 References
Recurrent Experience Replay in Distributed Reinforcement Learning
- 2019
Computer Science
ICLR
The effects of parameter lag, which results in representational drift and recurrent state staleness, are studied and an improved training strategy is empirically derived; the resulting agent, Recurrent Replay Distributed DQN, quadruples the previous state of the art on Atari-57 and matches the state of the art on DMLab-30.
CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning
- 2021
Computer Science
ICLR
CausalWorld is proposed, a benchmark for causal structure and transfer learning in a robotic manipulation environment that is a simulation of an open-source robotic platform, hence offering the possibility of sim-to-real transfer.
BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning
- 2019
Computer Science
ICLR
The BabyAI research platform is introduced to support investigations towards including humans in the loop for grounded language learning and puts forward strong evidence that current deep learning methods are not yet sufficiently sample efficient when it comes to learning a language with compositional properties.
Long Short-Term Memory
- 1997
Computer Science
Neural Computation
A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract)
- 2013
Computer Science
IJCAI
The promise of ALE is illustrated by developing and benchmarking domain-independent agents designed using well-established AI techniques for both reinforcement learning and planning, and an evaluation methodology made possible by ALE is proposed.
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
- 2021
Computer Science
ICLR
Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
Language Models are Unsupervised Multitask Learners
- 2019
Computer Science
It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
Meta Learning Shared Hierarchies
- 2018
Computer Science
ICLR
A metalearning approach for learning hierarchically structured policies, improving sample efficiency on unseen tasks through the use of shared primitives---policies that are executed for large numbers of timesteps, and provides a concrete metric for measuring the strength of such hierarchies.
Attention is All you Need
- 2017
Computer Science
NIPS
A new simple network architecture, the Transformer, based solely on attention mechanisms and dispensing with recurrence and convolutions entirely, is proposed; it generalizes well to other tasks, as demonstrated by applying it successfully to English constituency parsing with both large and limited training data.
Stochastic Neural Networks for Hierarchical Reinforcement Learning
- 2017
Computer Science
ICLR
This work proposes a general framework that first learns useful skills in a pre-training environment, and then leverages the acquired skills for learning faster in downstream tasks, and uses Stochastic Neural Networks combined with an information-theoretic regularizer to efficiently pre-train a large span of skills.