Reinforcement Learning: An Introduction
@article{Sutton2005ReinforcementLA, title={Reinforcement Learning: An Introduction}, author={Richard S. Sutton and Andrew G. Barto}, journal={IEEE Transactions on Neural Networks}, year={2005}, volume={16}, pages={285-286} }
Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view…
36,695 Citations
Use of Reinforcement Learning as a Challenge: A Review
- Computer Science
- 2013
This paper discusses the basic model of reinforcement learning (RL), the optimal policies used in RL, including the model-free and model-based policies used to reward the agent, and some directions for future research in reinforcement learning.
Algorithms for Reinforcement Learning
- Computer Science
- Algorithms for Reinforcement Learning
- 2010
This book focuses on those algorithms of reinforcement learning that build on the powerful theory of dynamic programming, and gives a fairly comprehensive catalog of learning problems, and describes the core ideas, followed by the discussion of their theoretical properties and limitations.
Control Optimization with Reinforcement Learning
- Computer Science
- 2015
This chapter focuses on a relatively new methodology called reinforcement learning (RL), a form of simulation-based dynamic programming, primarily used for solving Markov and semi-Markov decision problems.
Reinforcement Learning: A Technical Introduction – Part I
- Computer Science
- 2019
The paper offers an opinionated introduction to the advantages and drawbacks of several algorithmic approaches, in order to provide algorithmic design options.
Reinforcement Learning and Its Relationship to Supervised Learning
- Computer Science
- 2004
This chapter discusses stochastic sequential decision processes from the perspective of Machine Learning, focussing on reinforcement learning and its relationship to the more commonly studied supervised learning problems.
Derivative-free reinforcement learning: a review
- Computer Science
- Frontiers of Computer Science
- 2021
Methods of derivative-free reinforcement learning to date are summarized and organized by aspect, including parameter updating, model selection, exploration, and parallel/distributed methods.
Online learning of shaping rewards in reinforcement learning
- Computer Science
- Neural Networks
- 2010
Influence Value Q-Learning: A Reinforcement Learning Algorithm for Multi Agent Systems
- Computer Science
- 2009
The use of multi-agent systems has become popular for solving computational problems such as e-commerce, scheduling in transportation problems, estimation of energy demand, content-based image retrieval, and others.
Opposition-Based Reinforcement Learning
- Computer Science
- J. Adv. Comput. Intell. Intell. Informatics
- 2006
Opposition-based reinforcement learning, inspired by opposition-based learning, is introduced to speed up convergence. Considering opposite actions simultaneously enables individual states to be updated more than once, shortening exploration and expediting convergence.
Reinforcement Learning using Kohonen Feature Map Probabilistic Associative Memory based on Weights Distribution
- Computer Science
- 2010
This research proposes a reinforcement learning method using the Kohonen Feature Map Probabilistic Associative Memory based on Weights Distribution (KFMPAM-WD) (Osana, 2009). The method is based on the actor-critic method, with the actor realized by the KFMPAM-WD.
References
Reinforcement Learning: A Survey
- Psychology
- J. Artif. Intell. Res.
- 1996
Central issues of reinforcement learning are discussed, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.
Self-improving reactive agents based on reinforcement learning, planning and teaching
- Computer Science
- Machine Learning
- 2004
This paper compares eight reinforcement learning frameworks: adaptive heuristic critic (AHC) learning due to Sutton, Q-learning due to Watkins, and three extensions to both basic methods for speeding up learning. The three extensions are experience replay, learning action models for planning, and teaching.
Problem solving with reinforcement learning
- Computer Science
- 1995
This thesis is concerned with practical issues surrounding the application of reinforcement learning techniques to tasks that take place in high dimensional continuous state-space environments. In…
Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons
- Computer Science
- IJCAI
- 1991
This paper describes the input generalization problem (whereby the system must generalize to produce similar actions in similar situations) and an implemented solution, the G algorithm, which is based on recursive splitting of the state space based on statistical measures of differences in reinforcements received.
Importance sampling for reinforcement learning with multiple objectives
- Computer Science
- 2001
This thesis considers three complications that arise from applying reinforcement learning to a real-world application, and employs importance sampling (likelihood ratios) to achieve good performance in partially observable Markov decision processes with few data.
Adaptive Confidence and Adaptive Curiosity
- Computer Science
- 1991
This paper introduces ways for modelling the reliability of the outputs of adaptive predictors, and it describes more sophisticated and sometimes more efficient methods for their adaptive construction by on-line state space exploration.
Modular on-line function approximation for scaling up reinforcement learning
- Computer Science
- 1994
This dissertation extends existing ways of scaling up reinforcement learning methods and proposes several new approaches that can be used to enable reinforcement learning agents to acquire context-dependent evaluation functions and policies.
Gradient Descent for General Reinforcement Learning
- Computer Science
- NIPS
- 1998
A simple learning rule is derived, the VAPS algorithm, which can be instantiated to generate a wide range of new reinforcement-learning algorithms, and allows policy-search and value-based algorithms to be combined, thus unifying two very different approaches to reinforcement learning into a single Value and Policy Search algorithm.
Adaptive Critics and the Basal Ganglia
- Computer Science
- 1995
One consequence of the embedded agent view is the increasing interest in the learning paradigm called reinforcement learning (RL).