Human-level control through deep reinforcement learning
This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent capable of learning to excel at a diverse array of challenging tasks.
The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract)
This work illustrates the promise of the ALE by developing and benchmarking domain-independent agents built on well-established AI techniques for both reinforcement learning and planning, and proposes an evaluation methodology made possible by the ALE.
A Distributional Perspective on Reinforcement Learning
This paper argues for the fundamental importance of the value distribution: the distribution of the random return received by a reinforcement learning agent. It designs a new algorithm that applies Bellman's equation to the learning of approximate value distributions.
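The core operation of the algorithm from this paper (C51) is a projection step: the Bellman update shifts and shrinks the return distribution's support, and the result is projected back onto a fixed grid of atoms. A minimal NumPy sketch of that projection, under the simplifying assumption of a single scalar reward rather than a full transition batch:

```python
import numpy as np

def categorical_projection(atoms, probs, reward, gamma):
    """Project the Bellman-updated return distribution (r + gamma * z)
    back onto a fixed support of atoms, as in the C51 algorithm."""
    v_min, v_max = atoms[0], atoms[-1]
    delta = atoms[1] - atoms[0]
    # Apply the distributional Bellman operator to the support.
    tz = np.clip(reward + gamma * atoms, v_min, v_max)
    # Fractional index of each shifted atom on the original grid.
    b = (tz - v_min) / delta
    lower, upper = np.floor(b).astype(int), np.ceil(b).astype(int)
    new_probs = np.zeros_like(probs)
    for j in range(len(atoms)):
        if lower[j] == upper[j]:            # shifted atom lands exactly on the grid
            new_probs[lower[j]] += probs[j]
        else:                               # split mass between the two neighbours
            new_probs[lower[j]] += probs[j] * (upper[j] - b[j])
            new_probs[upper[j]] += probs[j] * (b[j] - lower[j])
    return new_probs
```

Because mass is split linearly between neighbouring atoms, the projection preserves total probability, and (when no clipping occurs) the mean of the projected distribution equals `reward + gamma * mean`.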
Unifying Count-Based Exploration and Intrinsic Motivation
This work uses density models to measure uncertainty and proposes a novel algorithm for deriving a pseudo-count from an arbitrary density model, generalizing count-based exploration algorithms to the non-tabular case.
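The pseudo-count in this paper is derived from how much the density model's prediction for a state improves after observing that state once more. A small sketch, pairing the paper's formula with a toy tabular density model (the model class and the in-place update are simplifications for illustration):

```python
from collections import Counter

class TabularDensityModel:
    """Toy density model: empirical frequencies. For this model the
    derived pseudo-count recovers the true visit count exactly."""
    def __init__(self):
        self.counts = Counter()
        self.total = 0
    def prob(self, x):
        return self.counts[x] / self.total if self.total else 0.0
    def update(self, x):
        self.counts[x] += 1
        self.total += 1

def pseudo_count(model, x):
    """Pseudo-count N(x) = rho(x) * (1 - rho'(x)) / (rho'(x) - rho(x)),
    where rho is the model's density for x before observing it and
    rho' (the recoding probability) is the density just after."""
    rho = model.prob(x)
    model.update(x)            # observe x (a real agent updates on every step)
    rho_prime = model.prob(x)
    if rho_prime <= rho:
        return float("inf")    # model gained nothing from seeing x again
    return rho * (1.0 - rho_prime) / (rho_prime - rho)
```

The exploration bonus is then typically proportional to `1 / sqrt(pseudo_count)`, so rarely-modeled states earn large bonuses.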
Distributional Reinforcement Learning with Quantile Regression
This paper builds a distributional approach to reinforcement learning in which the distribution over returns is modeled explicitly instead of only estimating its mean, and presents a novel distributional reinforcement learning algorithm consistent with the theoretical formulation.
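The algorithm (QR-DQN) represents the return distribution by a fixed set of quantile estimates and trains them with an asymmetric Huber loss. A minimal NumPy sketch of that loss for a single state-action pair, assuming target samples rather than target-network quantiles:

```python
import numpy as np

def quantile_huber_loss(pred_quantiles, target_samples, kappa=1.0):
    """Quantile-regression Huber loss in the style of QR-DQN.
    Each predicted value theta_i is pulled toward the tau_i-quantile
    of the target distribution, tau_i = (i + 0.5) / n."""
    n = len(pred_quantiles)
    taus = (np.arange(n) + 0.5) / n
    # Pairwise TD errors: target_j - prediction_i.
    u = target_samples[None, :] - pred_quantiles[:, None]
    huber = np.where(np.abs(u) <= kappa,
                     0.5 * u ** 2,
                     kappa * (np.abs(u) - 0.5 * kappa))
    # Asymmetric weight |tau - 1{u < 0}| penalises over- and
    # under-estimation differently, which is what makes the minimiser
    # the tau-quantile rather than the mean.
    weight = np.abs(taus[:, None] - (u < 0.0).astype(float))
    return float((weight * huber).mean())
```

The Huber smoothing (`kappa`) keeps gradients bounded near zero error while retaining the quantile loss's asymmetry in the tails.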
Safe and Efficient Off-Policy Reinforcement Learning
A novel algorithm, Retrace($\lambda$), is derived; it is believed to be the first return-based off-policy control algorithm converging a.s. to $Q^*$ without the GLIE assumption (Greedy in the Limit with Infinite Exploration).
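Retrace($\lambda$)'s key device is a clipped importance-sampling trace, $c_t = \lambda \min(1, \pi(a_t|x_t)/\mu(a_t|x_t))$, which cuts traces only as much as the off-policyness requires. A simplified single-trajectory sketch (note: the true operator uses an expectation of $Q$ under $\pi$ at the next state; this sketch substitutes the taken action's value for brevity):

```python
import numpy as np

def retrace_correction(q, rewards, target_probs, behav_probs,
                       gamma=0.99, lam=1.0):
    """Return-based off-policy correction in the style of Retrace(lambda)
    for one trajectory. `q[t]` holds Q(x_t, a_t); `target_probs[t]` and
    `behav_probs[t]` are pi and mu probabilities of the taken action.
    Returns the correction term added to Q(x_0, a_0)."""
    T = len(rewards)
    correction, trace = 0.0, 1.0
    for t in range(T):
        if t > 0:
            # Clipped importance ratio: never amplifies, only cuts.
            trace *= lam * min(1.0, target_probs[t] / behav_probs[t])
        next_q = q[t + 1] if t + 1 < len(q) else 0.0
        delta = rewards[t] + gamma * next_q - q[t]   # TD error at step t
        correction += (gamma ** t) * trace * delta
    return correction
```

When behaviour and target policies coincide and `lam=1.0`, the traces stay at 1 and the update reduces to the on-policy $\lambda$-return correction.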
Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
This paper takes a big-picture look at how the Arcade Learning Environment is being used by the research community, revisiting challenges posed when the ALE was introduced, summarizing the state of the art on various problems, and highlighting problems that remain open.
An Introduction to Deep Reinforcement Learning
This manuscript provides an introduction to deep reinforcement learning models, algorithms, and techniques, with particular focus on generalization and on how deep RL can be used for practical applications.
Count-Based Exploration with Neural Density Models
Bellemare et al. (2016) introduced the notion of a pseudo-count, derived from a density model, to generalize count-based exploration to non-tabular reinforcement learning. This pseudo-count was used to generate an exploration bonus for an agent, yielding strong exploration on hard Atari 2600 games such as Montezuma's Revenge.
The Cramer Distance as a Solution to Biased Wasserstein Gradients
This paper describes three natural properties of probability divergences that reflect requirements from machine learning: sum invariance, scale sensitivity, and unbiased sample gradients. It proposes an alternative to the Wasserstein metric, the Cramer distance, which possesses all three desired properties.
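The Cramer distance is the $\ell_2$ distance between two cumulative distribution functions (where 1-Wasserstein is the $\ell_1$ distance between them). A small sketch estimating it from samples on a numerical grid; the empirical-CDF-on-a-grid approach is an illustrative simplification:

```python
import numpy as np

def cramer_distance(samples_p, samples_q, grid):
    """Cramer distance between two sample sets: the l2 distance between
    their empirical CDFs, integrated numerically over a uniform grid."""
    # Empirical CDFs evaluated at every grid point via broadcasting.
    cdf_p = np.mean(samples_p[None, :] <= grid[:, None], axis=1)
    cdf_q = np.mean(samples_q[None, :] <= grid[:, None], axis=1)
    dx = grid[1] - grid[0]
    return float(np.sqrt(np.sum((cdf_p - cdf_q) ** 2) * dx))
```

For two point masses at 0 and 1, the CDFs differ by 1 exactly on $[0, 1)$, so the distance comes out to 1; for identical sample sets it is 0.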