This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.Expand

In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning.Expand

It is shown how a system consisting of two neuronlike adaptive elements can solve a difficult learning control problem and the relation of this work to classical and instrumental conditioning in animal learning studies and its possible implications for research in the neurosciences.Expand

An algorithm based on dynamic programming, which is called Real-Time DP, is introduced, by which an embedded system can improve its performance with experience and illuminate aspects of other DP-based reinforcement learning methods such as Watkins'' Q-Learning algorithm.Expand

We introduce two new temporal diffence (TD) algorithms based on the theory of linear least-squares function approximation. We define an algorithm we call Least-Squares TD (LS TD) for which we prove… Expand

This work reviews several approaches to temporal abstraction and hierarchical organization that machine learning researchers have recently developed and discusses extensions of these ideas to concurrent activities, multiagent coordination, and hierarchical memory for addressing partial observability.Expand

One of the most active areas of research in artificial intelligence is the study of learning methods by which “embedded agents” can improve performance while acting in complex dynamic environments.… Expand

Two new temporal difference algorithms based on the theory of linear least-squares function approximation, LS TD and RLS TD, are introduced and prove probability-one convergence when it is used with a function approximator linear in the adjustable parameters.Expand