Author pages are created from data sourced from our academic publisher partnerships and public sources.
Share This Author
Continuous control with deep reinforcement learning
This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
Asynchronous Methods for Deep Reinforcement Learning
A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
Matching Networks for One Shot Learning
- Oriol Vinyals, C. Blundell, T. Lillicrap, K. Kavukcuoglu, Daan Wierstra
- Computer ScienceNIPS
- 13 June 2016
This work employs ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories to learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types.
Mastering the game of Go with deep neural networks and tree search
Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0.5, the first time that a computer program has defeated a human professional player in the full-sized game of Go.
Mastering the game of Go without human knowledge
An algorithm based solely on reinforcement learning is introduced, without human data, guidance or domain knowledge beyond game rules, that achieves superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.
A simple neural network module for relational reasoning
This work shows how a deep learning architecture equipped with an RN module can implicitly discover and learn to reason about entities and their relations.
Learning Latent Dynamics for Planning from Pixels
The Deep Planning Network (PlaNet) is proposed, a purely model-based agent that learns the environment dynamics from images and chooses actions through fast online planning in latent space using a latent dynamics model with both deterministic and stochastic transition components.
Dream to Control: Learning Behaviors by Latent Imagination
Dreamer is presented, a reinforcement learning agent that solves long-horizon tasks purely by latent imagination and efficiently learn behaviors by backpropagating analytic gradients of learned state values through trajectories imagined in the compact state space of a learned world model.
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
This paper generalizes the AlphaZero approach into a single AlphaZero algorithm that can achieve superhuman performance in many challenging games, and convincingly defeated a world champion program in the games of chess and shogi (Japanese chess), as well as Go.
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
This paper generalises the approach into a single AlphaZero algorithm that can achieve, tabula rasa, superhuman performance in many challenging domains, and convincingly defeated a world-champion program in each case.