Reinforcement Learning: An Introduction

Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it…

Learning to Act Using Real-Time Dynamic Programming

- A. Barto, Steven J. Bradtke, Satinder Singh
- Computer Science
- Artif. Intell.
- 1995

Learning methods based on dynamic programming (DP) are receiving increasing attention in artificial intelligence. Researchers have argued that DP provides the appropriate basis for compiling planning…

Neuronlike adaptive elements that can solve difficult learning control problems

- A. Barto, R. Sutton, C. Anderson
- Computer Science, Psychology
- IEEE Transactions on Systems, Man, and…
- 1 September 1983

It is shown how a system consisting of two neuronlike adaptive elements can solve a difficult learning control problem. The task is to balance a pole that is hinged to a movable cart by applying…

Adaptive critics and the basal ganglia.

- A. Barto
- Psychology
- 1995

One of the most active areas of research in artificial intelligence is the study of learning methods by which “embedded agents” can improve performance while acting in complex dynamic environments.…

Linear Least-Squares Algorithms for Temporal Difference Learning

- Steven J. Bradtke, A. Barto
- Computer Science
- Machine Learning
- 1996

We introduce two new temporal difference (TD) algorithms based on the theory of linear least-squares function approximation. We define an algorithm we call Least-Squares TD (LS TD) for which we prove…

Toward a modern theory of adaptive networks: expectation and prediction.

Many adaptive neural network theories are based on neuronlike adaptive elements that can behave as single unit analogs of associative conditioning. In this article we develop a similar adaptive…

Linear Least-Squares algorithms for temporal difference learning

- Steven J. Bradtke, A. Barto
- Mathematics
- Machine Learning
- 2004

We introduce two new temporal difference (TD) algorithms based on the theory of linear least-squares function approximation. We define an algorithm we call Least-Squares TD (LS TD) for which we prove…

Recent Advances in Hierarchical Reinforcement Learning

- A. Barto, S. Mahadevan
- Computer Science
- Discret. Event Dyn. Syst.
- 2003

Reinforcement learning is bedeviled by the curse of dimensionality: the number of parameters to be learned grows exponentially with the size of any compact encoding of a state. Recent attempts to…