Continuous control with deep reinforcement learning
- T. Lillicrap, Jonathan J. Hunt, Daan Wierstra
- Computer Science, International Conference on Learning Representations
- 9 September 2015
This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
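The core update behind this algorithm (the deterministic policy gradient) can be sketched on a toy one-step problem. Here the critic is given in closed form as Q(s, a) = -(a - s)^2, rather than learned from replayed experience as in the paper, and the actor is a single linear parameter; everything numeric is illustrative.

```python
import numpy as np

# Deterministic policy gradient on a one-step toy problem where the
# true Q-function is known: Q(s, a) = -(a - s)^2, so the optimal
# action is a = s, and the optimal linear actor mu(s) = theta*s has
# theta = 1. The gradient of the objective follows the chain rule:
# grad_theta J = E[ dQ/da * dmu/dtheta ].

rng = np.random.default_rng(0)
theta = 0.0                                    # linear actor: mu(s) = theta * s

for _ in range(200):
    s = rng.uniform(-1.0, 1.0, size=64)        # batch of states
    a = theta * s                              # deterministic actions
    dq_da = -2.0 * (a - s)                     # gradient of Q w.r.t. the action
    grad_theta = np.mean(dq_da * s)            # chain rule through mu
    theta += 0.1 * grad_theta                  # gradient ascent on J
```

After a few hundred updates `theta` settles near the optimum of 1; the full method additionally learns the critic by temporal-difference updates from a replay buffer.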
Successor Features for Transfer in Reinforcement Learning
This work proposes a transfer framework for the scenario where the reward function changes between tasks but the environment's dynamics remain the same, derives two theorems that place the approach on firm theoretical ground, and presents experiments showing that it successfully promotes transfer in practice.
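The key decomposition can be sketched in a few lines: if rewards factor as r(s, a) = phi(s, a) . w, then Q^pi(s, a) = psi^pi(s, a) . w, where psi^pi are the successor features of policy pi. A new task changes only w, so stored psi tables can be re-evaluated instantly and generalized policy improvement (GPI) acts greedily over the best of the old policies. The numbers below are illustrative, not taken from the paper.

```python
import numpy as np

# Two previously learned policies, each with successor features
# psi(s, a) for a single state with 2 actions and 2 reward features.
psi = np.array([
    [[1.0, 0.2], [0.3, 0.9]],    # policy 0: psi for actions 0 and 1
    [[0.1, 1.1], [0.8, 0.4]],    # policy 1
])                               # shape: (policies, actions, features)

w_new = np.array([0.0, 1.0])     # new task: only feature 1 is rewarded

q = psi @ w_new                  # Q-values of every old policy, no learning
best_action = int(np.argmax(q.max(axis=0)))   # GPI: greedy over the best policy
```

Here policy 1 values action 0 highest under the new reward weights, so GPI picks action 0 without any additional training.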
Deep Reinforcement Learning in Large Discrete Action Spaces
This paper leverages prior information about the actions to embed them in a continuous space upon which it can generalize, and uses approximate nearest-neighbor methods to allow reinforcement learning methods to be applied to large-scale learning problems previously intractable with current methods.
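The embed-then-lookup idea can be sketched as follows: each discrete action gets a fixed embedding in a continuous space, a continuous-control actor emits a "proto-action", and the nearest real actions are retrieved as candidates for a critic to re-rank. The paper uses approximate nearest-neighbour search for scale; this toy uses exact distances, and all values are illustrative.

```python
import numpy as np

# 1000 discrete actions embedded in R^4 (random embeddings stand in
# for the prior action information used in the paper).
rng = np.random.default_rng(1)
embeddings = rng.normal(size=(1000, 4))

def nearest_actions(proto, k=5):
    """Return indices of the k discrete actions closest to a proto-action."""
    d = np.linalg.norm(embeddings - proto, axis=1)
    return np.argsort(d)[:k]

# The actor outputs a continuous proto-action near action 42's embedding.
proto = embeddings[42] + 0.01 * rng.normal(size=4)
candidates = nearest_actions(proto)
# A critic would now score `candidates`; here the closest one wins.
chosen = int(candidates[0])
```

The point of the nearest-neighbour step is that the actor never has to enumerate the full action set, which is what makes very large discrete spaces tractable.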
Memory-based control with recurrent neural networks
This work extends two related, model-free algorithms for continuous control to solve partially observed domains, using recurrent neural networks trained with backpropagation through time, and finds that recurrent deterministic and stochastic policies learn similarly good solutions to these tasks, including the water maze, where the agent must learn effective search strategies.
Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
This work presents an end-to-end differentiable memory access scheme, which they call Sparse Access Memory (SAM), that retains the representational power of the original approaches whilst training efficiently with very large memories, and achieves asymptotic lower bounds in space and time complexity.
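The sparse read at the heart of this scheme can be sketched directly: instead of a dense softmax over every memory slot, keep only the top-k most similar slots, so each read touches O(k) rows rather than O(N). (SAM also makes writes sparse and maintains an approximate nearest-neighbour index; both are omitted in this illustrative toy.)

```python
import numpy as np

# A large memory matrix and a query that matches slot 7.
rng = np.random.default_rng(2)
memory = rng.normal(size=(100_000, 64))
query = memory[7].copy()

scores = memory @ query                  # content-based similarities
k = 4
top = np.argpartition(scores, -k)[-k:]   # indices of the k best slots (O(N) scan,
                                         # O(log N)-ish with an ANN index as in SAM)
w = np.exp(scores[top] - scores[top].max())
w /= w.sum()                             # softmax over only k slots
read = w @ memory[top]                   # sparse weighted read vector
```

Because the softmax is restricted to k slots, the gradient of the read also touches only those rows, which is what keeps training efficient as the memory grows.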
The Option Keyboard: Combining Skills in Reinforcement Learning
- André Barreto, Diana Borsa, Doina Precup
- Computer Science, Neural Information Processing Systems
- 24 June 2021
It is shown that every deterministic option can be unambiguously represented as a cumulant defined in an extended domain, which means that, once options associated with a set of cumulants have been learned, options induced by any linear combination of those cumulants can be synthesised instantaneously, without any further learning.
Sparse Coding Can Predict Primary Visual Cortex Receptive Field Changes Induced by Abnormal Visual Input
In every condition, the changes in receptive field properties previously observed experimentally were matched to a similar and highly faithful degree by all the models, suggesting that early sensory development can indeed be understood in terms of an impetus towards sparsity.
Composing Entropic Policies using Divergence Correction
- Jonathan J. Hunt, André Barreto, T. Lillicrap, N. Heess
- Computer Science, International Conference on Machine Learning
- 27 September 2018
This work extends an important generalization of policy improvement to the maximum entropy framework, introduces an algorithm for the practical implementation of successor features in continuous action spaces, and proposes a novel approach that addresses the failure cases of prior work and recovers the optimal policy during transfer.
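The composition setting can be sketched for a single decision: each task i has a soft Q-function, and its entropic (Boltzmann) policy is pi_i(a) proportional to exp(Q_i(a)). A naive way to compose two tasks is to act under exp(Q_1 + Q_2), i.e. the product of the two policies; the paper's contribution is showing when this transferred policy fails and deriving a divergence correction that recovers optimality. The Q-values below are illustrative.

```python
import numpy as np

def softmax(x):
    """Boltzmann distribution over actions from soft Q-values."""
    z = np.exp(x - x.max())
    return z / z.sum()

q1 = np.array([1.0, 0.0, -1.0])   # soft Q-values of task 1 (3 actions)
q2 = np.array([-1.0, 0.5, 1.0])   # soft Q-values of task 2

# Naive entropic composition: product of the two Boltzmann policies,
# equivalently a softmax over the summed soft Q-values.
pi_composed = softmax(q1 + q2)
```

Here the composed policy prefers the middle action, which compromises between the two tasks; the divergence correction in the paper quantifies how far such a composite can be from the truly optimal policy for the combined reward.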
The Combinatorics of Neurite Self-Avoidance
Closed-form solutions for the general case of neural development in Drosophila reveal the relationships among the key variables and how these constrain possible biological scenarios.
Statistical structure of lateral connections in the primary visual cortex
The hypothesis that long-range excitatory lateral connections in the primary visual cortex, which are believed to be involved in contour grouping, display a similar co-circular structure is tested.