
- Richard S. Sutton, Andrew G. Barto
- Adaptive computation and machine learning
- 1998

The reinforcement learning (RL) problem is the challenge of artificial intelligence in a microcosm: how can we build an agent that can plan, learn, perceive, and act in a complex world? There’s a great new book on the market that lays out the conceptual and algorithmic foundations of this exciting area. RL pioneers Rich Sutton and Andy Barto have published… (More)

- Andrew G. Barto, Richard S. Sutton, Charles W. Anderson
- IEEE Trans. Systems, Man, and Cybernetics
- 1983

- Andrew G. Barto, Steven J. Bradtke, Satinder P. Singh
- Artif. Intell.
- 1995

Learning methods based on dynamic programming (DP) are receiving increasing attention in artificial intelligence. Researchers have argued that DP provides the appropriate basis for compiling planning results into reactive strategies for real-time control, as well as for learning such strategies when the system being controlled is incompletely known. We… (More)
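The DP basis this abstract refers to can be illustrated with a plain value-iteration backup. The sketch below is not the paper's real-time DP algorithm; the toy three-state MDP, its rewards, and all names are invented here for illustration.

```python
# Synchronous value iteration on a toy 3-state MDP (illustrative only).
GAMMA = 0.9

# transitions[s][a] = list of (prob, next_state, reward); state 2 is terminal.
transitions = {
    0: {"left": [(1.0, 0, 0.0)], "right": [(1.0, 1, 0.0)]},
    1: {"left": [(1.0, 0, 0.0)], "right": [(1.0, 2, 1.0)]},
    2: {},
}

def value_iteration(transitions, gamma, tol=1e-8):
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:          # terminal state keeps value 0
                continue
            # Bellman optimality backup: max over actions of expected return.
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

V = value_iteration(transitions, GAMMA)  # V[1] = 1.0, V[0] = 0.9
```

Real-time variants of this idea apply such backups only to states actually visited by the controller, rather than sweeping the whole state set.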

- Steven J. Bradtke, Andrew G. Barto
- Machine Learning
- 1996

We introduce two new temporal difference (TD) algorithms based on the theory of linear least-squares function approximation. We define an algorithm we call Least-Squares TD (LS TD) for which we prove probability-one convergence when it is used with a function approximator linear in the adjustable parameters. We then define a recursive version of this… (More)
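The batch form of a least-squares TD solver can be sketched as follows: accumulate the statistics A = Σ φ(φ − γφ′)ᵀ and b = Σ φr over observed transitions, then solve Aw = b for the linear value-function weights. The two-state chain and one-hot features below are invented for illustration, and this is only a sketch in the spirit of the paper, not its exact algorithm.

```python
# Minimal batch LSTD sketch for 2-dimensional linear features.
GAMMA = 0.9

def lstd(transitions, gamma):
    """transitions: list of (phi, reward, phi_next), each phi 2-dimensional.
    Solves A w = b, where A = sum phi (phi - gamma*phi')^T and b = sum phi*r."""
    A = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for phi, r, phi_next in transitions:
        diff = [phi[j] - gamma * phi_next[j] for j in range(2)]
        for i in range(2):
            for j in range(2):
                A[i][j] += phi[i] * diff[j]
            b[i] += phi[i] * r
    # 2x2 linear solve via Cramer's rule (fine at this toy scale).
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    w0 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    w1 = (A[0][0] * b[1] - b[0] * A[1][0]) / det
    return [w0, w1]

# One episode on a two-state chain: s0 -> s1 (r=0), then s1 -> terminal (r=1).
episode = [([1.0, 0.0], 0.0, [0.0, 1.0]),
           ([0.0, 1.0], 1.0, [0.0, 0.0])]
w = lstd(episode, GAMMA)   # w approximates [V(s0), V(s1)] = [0.9, 1.0]
```

The recursive version mentioned in the abstract maintains these statistics incrementally instead of solving the system from scratch after each batch.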

- Andrew G. Barto
- 1995

One of the most active areas of research in artificial intelligence is the study of learning methods by which “embedded agents” can improve performance while acting in complex dynamic environments. An agent, or decision maker, is embedded in an environment when it receives information from, and acts on, that environment in an ongoing closed-loop… (More)

- Andrew G. Barto, Sridhar Mahadevan
- Discrete Event Dynamic Systems
- 2003

Reinforcement learning is bedeviled by the curse of dimensionality: the number of parameters to be learned grows exponentially with the size of any compact encoding of a state. Recent attempts to combat the curse of dimensionality have turned to principled ways of exploiting temporal abstraction, where decisions are not required at each step, but rather… (More)

- Richard S. Sutton, Andrew G. Barto
- Psychological review
- 1981

Many adaptive neural network theories are based on neuronlike adaptive elements that can behave as single unit analogs of associative conditioning. In this article we develop a similar adaptive element, but one which is more closely in accord with the facts of animal learning theory than elements commonly studied in adaptive network research. We suggest… (More)

This chapter presents a model of classical conditioning called the temporal-difference (TD) model. The TD model was originally developed as a neuron-like unit for use in adaptive networks (Sutton and Barto 1987; Sutton 1984; Barto, Sutton and Anderson 1983). In this paper, however, we analyze it from the point of view of animal learning theory. Our intended… (More)
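At the core of the TD model is the prediction error δ = r + γV(s′) − V(s), which plays the role of the reinforcement signal in conditioning. The two-step "CS then US" trial below is a toy setup invented to show the mechanism, not the chapter's actual simulation.

```python
# TD(0) sketch of classical conditioning: a conditioned stimulus (CS)
# precedes an unconditioned stimulus (US, reward 1). Over trials the
# prediction at the CS comes to anticipate the US.
GAMMA, ALPHA = 1.0, 0.1

def run_trials(n_trials):
    V = {"CS": 0.0, "US": 0.0}
    for _ in range(n_trials):
        # CS step: no reward yet; the next state is the US.
        delta = 0.0 + GAMMA * V["US"] - V["CS"]
        V["CS"] += ALPHA * delta
        # US step: reward arrives; the trial then ends (terminal value 0).
        delta = 1.0 + GAMMA * 0.0 - V["US"]
        V["US"] += ALPHA * delta
    return V

V = run_trials(1000)   # both predictions approach 1.0
```

The backward transfer of prediction from US to CS is what lets the model reproduce anticipatory responding in conditioning experiments.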

Humans and other animals often engage in activities for their own sakes rather than as steps toward solving practical problems. Psychologists call these intrinsically motivated behaviors. What we learn during intrinsically motivated behavior is essential for our development as competent autonomous entities able to efficiently solve a wide range of practical… (More)

- Robert A. Jacobs, Michael I. Jordan, Andrew G. Barto
- Cognitive Science
- 1991

A novel modular connectionist architecture is presented in which the networks composing the architecture compete to learn the training patterns. An outcome of the competition is that different networks learn different training patterns and, thus, learn to compute different functions. The architecture performs task decomposition in the sense that it learns to… (More)
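The competitive mechanism described above can be caricatured with two trivially simple "expert" units: for each training pattern, only the expert with the smaller error is updated, so the experts specialize on different patterns. The scalar experts, targets, and learning rate are all invented here; the real architecture uses full connectionist networks and a trainable gating network.

```python
# Toy winner-take-all competition between two scalar "experts".
ALPHA = 0.5

def train(targets, n_passes=50):
    experts = [0.4, 0.6]          # slightly different starting points
    for _ in range(n_passes):
        for t in targets:
            errs = [abs(t - e) for e in experts]
            winner = errs.index(min(errs))         # competition for the pattern
            experts[winner] += ALPHA * (t - experts[winner])  # only winner learns
    return sorted(experts)

# Two clusters of training targets; competition drives each expert to one cluster,
# i.e. a crude form of task decomposition.
experts = train([0.0, 0.0, 1.0, 1.0])   # -> one expert near 0, one near 1
```

Breaking the initial symmetry between the experts is what allows the competition to assign each cluster of patterns to a different unit.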