Deep Reinforcement Learning with Double Q-Learning
TLDR: This paper proposes a specific adaptation to the DQN algorithm and shows that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but also leads to much better performance on several games.
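For concreteness, the sketch below illustrates the Double DQN target this adaptation introduces: the online network selects the next action and the target network evaluates it. This is a minimal PyTorch-style sketch under assumed names (online_net, target_net, next_obs), not the paper's code.

```python
# Minimal sketch of the Double DQN target; online_net/target_net are assumed to
# map a batch of observations to Q-values of shape [batch, n_actions].
import torch

def double_dqn_target(online_net, target_net, reward, next_obs, done, gamma=0.99):
    with torch.no_grad():
        # Action selection uses the online network ...
        next_actions = online_net(next_obs).argmax(dim=1, keepdim=True)
        # ... but evaluation uses the target network, decoupling selection from
        # evaluation and reducing the upward bias of the standard Q-learning target.
        next_q = target_net(next_obs).gather(1, next_actions).squeeze(1)
    return reward + gamma * (1.0 - done) * next_q
```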
Dueling Network Architectures for Deep Reinforcement Learning
TLDR: This paper presents a new neural network architecture for model-free reinforcement learning that leads to better policy evaluation in the presence of many similar-valued actions and enables the RL agent to outperform the state-of-the-art on the Atari 2600 domain.
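A minimal sketch of the dueling idea, assuming a PyTorch setup with illustrative layer names and sizes: the network splits into a state-value stream and an advantage stream that are recombined into Q-values.

```python
# Sketch of a dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
import torch
import torch.nn as nn

class DuelingHead(nn.Module):
    def __init__(self, feature_dim: int, n_actions: int):
        super().__init__()
        self.value = nn.Linear(feature_dim, 1)               # state-value stream V(s)
        self.advantage = nn.Linear(feature_dim, n_actions)   # advantage stream A(s, a)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        v = self.value(features)
        a = self.advantage(features)
        # Subtracting the mean advantage keeps the V/A decomposition identifiable.
        return v + a - a.mean(dim=1, keepdim=True)
```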
Rainbow: Combining Improvements in Deep Reinforcement Learning
TLDR: This paper examines six extensions to the DQN algorithm and empirically studies their combination, showing that the combination provides state-of-the-art performance on the Atari 2600 benchmark, both in terms of data efficiency and final performance.
Double Q-learning
TLDR: An alternative way to approximate the maximum expected value for any set of random variables is introduced, and the obtained double estimator method is shown to sometimes underestimate rather than overestimate the maximum expected value.
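As a rough sketch of the double estimator (generic notation, not the paper's exact symbols): one estimator chooses the maximizing action while the other evaluates it, which is what removes the systematic overestimation of a single maximizing estimator.

```latex
% Double estimator sketch (generic notation): estimator A selects the action,
% estimator B supplies its value; in practice the roles of A and B are swapped at random.
a^{*} = \arg\max_{a} Q^{A}(s', a), \qquad
Q^{A}(s, a) \leftarrow Q^{A}(s, a)
  + \alpha \left[ r + \gamma\, Q^{B}(s', a^{*}) - Q^{A}(s, a) \right]
```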
StarCraft II: A New Challenge for Reinforcement Learning
TLDR: This paper introduces SC2LE (StarCraft II Learning Environment), a reinforcement learning environment based on the game StarCraft II that offers a new and challenging domain for exploring deep reinforcement learning algorithms and architectures, and gives initial baseline results for neural networks trained on an accompanying dataset of game replays to predict game outcomes and player actions.
Distributed Prioritized Experience Replay
TLDR: This work proposes a distributed architecture for deep reinforcement learning at scale that enables agents to learn effectively from orders of magnitude more data than previously possible, and substantially improves the state of the art on the Arcade Learning Environment.
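The sketch below illustrates the proportional prioritized sampling that this architecture scales up with many actors feeding a shared buffer and a single learner sampling from it; the function and variable names are illustrative assumptions, not the paper's code.

```python
# Sketch of proportional prioritized sampling: priorities are |TD error|^alpha.
import numpy as np

def sample_indices(td_errors: np.ndarray, batch_size: int,
                   alpha: float = 0.6, eps: float = 1e-6) -> np.ndarray:
    priorities = (np.abs(td_errors) + eps) ** alpha
    probs = priorities / priorities.sum()
    # Transitions with larger TD error are replayed more often; distributed actors
    # keep refilling the buffer while the learner repeatedly draws batches like this.
    return np.random.choice(len(td_errors), size=batch_size, p=probs)
```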
Deep Reinforcement Learning in Large Discrete Action Spaces
TLDR: This paper leverages prior information about the actions to embed them in a continuous space over which it can generalize, and uses approximate nearest-neighbor methods to make reinforcement learning applicable to large-scale problems that were previously intractable.
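A rough sketch of that embedding idea, with all names (proto_action, action_embeddings, q_fn, k) being illustrative assumptions: a continuous proto-action is mapped to its nearest discrete action embeddings, and the candidate the value function scores highest is executed.

```python
# Sketch: generalize over a large discrete action set via embeddings.
# action_embeddings: (n_actions, d) matrix; proto_action: (d,) continuous policy output;
# q_fn scores (state, action_index) pairs. Brute-force nearest neighbors for clarity.
import numpy as np

def select_action(state, proto_action, action_embeddings, q_fn, k=10):
    dists = np.linalg.norm(action_embeddings - proto_action, axis=1)
    candidates = np.argsort(dists)[:k]          # k nearest discrete actions
    # Refine by picking the candidate the critic values most highly.
    return max(candidates, key=lambda a: q_fn(state, a))
```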
Successor Features for Transfer in Reinforcement Learning
TLDR: This work proposes a transfer framework for the scenario where the reward function changes between tasks but the environment's dynamics remain the same, derives two theorems that put the approach on firm theoretical ground, and presents experiments showing that it successfully promotes transfer in practice.
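A compact sketch of the decomposition behind successor features, in standard notation assumed here rather than quoted from the paper: rewards factor into task-independent features φ and a task-specific weight vector w, so values transfer across tasks by re-weighting the successor features ψ.

```latex
% Successor-feature decomposition (standard notation, assumed):
r(s, a, s') = \phi(s, a, s')^{\top} \mathbf{w}, \qquad
\psi^{\pi}(s, a) = \mathbb{E}^{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, \phi_{t+1}
  \,\middle|\, s_0 = s,\ a_0 = a \right], \qquad
Q^{\pi}(s, a) = \psi^{\pi}(s, a)^{\top} \mathbf{w}
```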
A theoretical and empirical analysis of Expected Sarsa
TLDR: It is proved that Expected Sarsa converges under the same conditions as Sarsa, specific hypotheses are formulated about when Expected Sarsa will outperform Sarsa and Q-learning, and it is demonstrated that Expected Sarsa has significant advantages over these more commonly used methods.
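For reference, the Expected Sarsa update in its standard textbook form (not quoted from the paper) replaces Sarsa's sampled next action with an expectation under the current policy:

```latex
% Expected Sarsa update (standard form):
Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1}
  + \gamma \sum_{a'} \pi(a' \mid s_{t+1})\, Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```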
Meta-Gradient Reinforcement Learning
TLDR: A gradient-based meta-learning algorithm is presented that adapts the nature of the return online, while interacting with and learning from the environment, and that achieves a new state-of-the-art performance.
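A brief sketch of the meta-gradient idea under assumed notation: the parameter update depends on meta-parameters η (for example the discount γ and bootstrapping parameter λ), and η is adapted online by differentiating a subsequent objective through that update.

```latex
% Meta-gradient sketch (notation assumed, not quoted from the paper): the inner
% update f depends on eta; eta follows the gradient of a later objective J'
% evaluated at the updated parameters theta'.
\theta' = \theta + f(\tau, \theta, \eta), \qquad
\Delta \eta \propto
  \frac{\partial J'(\tau', \theta', \bar{\eta})}{\partial \theta'}\,
  \frac{\partial \theta'}{\partial \eta}
```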