• Publications
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
TLDR
A new distributed agent, IMPALA (Importance Weighted Actor-Learner Architecture), is developed that not only uses resources more efficiently in single-machine training but also scales to thousands of machines without sacrificing data efficiency or resource utilisation.
Progressive Neural Networks
TLDR
This work evaluates the progressive networks architecture extensively on a wide variety of reinforcement learning tasks and demonstrates that transfer occurs at both low-level sensory and high-level control layers of the learned policy.
Learning to reinforcement learn
TLDR
This work introduces a novel approach to deep meta-reinforcement learning: a system that is trained using one RL algorithm, but whose recurrent dynamics implement a second, quite separate RL procedure.
Learning to Navigate in Complex Environments
TLDR
This work considers jointly learning the goal-driven reinforcement learning problem with auxiliary depth prediction and loop closure classification tasks and shows that data efficiency and task performance can be dramatically improved by relying on additional auxiliary tasks leveraging multimodal sensory inputs.
Grounded Language Learning in a Simulated 3D World
TLDR
An agent is presented that learns to interpret language in a simulated 3D environment, where it is rewarded for the successful execution of written instructions. Its comprehension of language extends beyond its prior experience, enabling it to apply familiar language to unfamiliar situations and to interpret entirely novel instructions.
Vector-based navigation using grid-like representations in artificial agents
TLDR
These findings show that emergent grid-like representations furnish agents with a Euclidean spatial metric and associated vector operations, providing a foundation for proficient navigation, and support neuroscientific theories that see grid cells as critical for vector-based navigation.
Prefrontal cortex as a meta-reinforcement learning system
TLDR
A new theory is presented showing how learning to learn may arise from interactions between prefrontal cortex and the dopamine system, providing a fresh foundation for future research.
Multi-task Deep Reinforcement Learning with PopArt
TLDR
This work proposes to automatically adapt the contribution of each task to the agent’s updates, so that all tasks have a similar impact on the learning dynamics, and learns a single trained policy that exceeds median human performance on this multi-task domain.
V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control
TLDR
V-MPO is introduced, an on-policy adaptation of Maximum a Posteriori Policy Optimization that performs policy iteration based on a learned state-value function and does so reliably without importance weighting, entropy regularization, or population-based tuning of hyperparameters.
Leveraging Monolingual Data for Crosslingual Compositional Word Representations
TLDR
This work presents a novel neural-network-based architecture for inducing compositional crosslingual word representations. It constrains the word-level representations to be compositional, can leverage both bilingual and monolingual data, and scales to large vocabularies and large quantities of data.