Reinforcement Learning: An Introduction
TLDR
This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, with coverage ranging from the history of the field's intellectual foundations to the most recent developments and applications.
Introduction to Reinforcement Learning
TLDR
In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning.
Neuronlike adaptive elements that can solve difficult learning control problems
TLDR
It is shown how a system consisting of two neuronlike adaptive elements can solve a difficult learning control problem. The relation of this work to classical and instrumental conditioning in animal learning studies, and its possible implications for research in the neurosciences, are also discussed.
Reinforcement learning
Learning to Act Using Real-Time Dynamic Programming
TLDR
An algorithm based on dynamic programming, called Real-Time DP, is introduced, by which an embedded system can improve its performance with experience. The work also illuminates aspects of other DP-based reinforcement learning methods such as Watkins' Q-Learning algorithm.
Linear Least-Squares algorithms for temporal difference learning
We introduce two new temporal difference (TD) algorithms based on the theory of linear least-squares function approximation. We define an algorithm we call Least-Squares TD (LS TD) for which we prove
Recent Advances in Hierarchical Reinforcement Learning
TLDR
This work reviews several approaches to temporal abstraction and hierarchical organization that machine learning researchers have recently developed and discusses extensions of these ideas to concurrent activities, multiagent coordination, and hierarchical memory for addressing partial observability.
Adaptive Critics and the Basal Ganglia
One of the most active areas of research in artificial intelligence is the study of learning methods by which “embedded agents” can improve performance while acting in complex dynamic environments.
Linear Least-Squares Algorithms for Temporal Difference Learning
TLDR
Two new temporal difference algorithms based on the theory of linear least-squares function approximation, LS TD and RLS TD, are introduced, and probability-one convergence is proved for LS TD when it is used with a function approximator linear in the adjustable parameters.