Reinforcement Learning: An Introduction

@article{Sutton2005ReinforcementLA,
  title={Reinforcement Learning: An Introduction},
  author={Richard S. Sutton and Andrew G. Barto},
  journal={IEEE Transactions on Neural Networks},
  year={2005},
  volume={16},
  pages={285--286}
}
  • R. Sutton, A. Barto
  • Published 2005
  • Computer Science
  • IEEE Transactions on Neural Networks
Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. [...] Key Method: Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view…
Use of Reinforcement Learning as a Challenge: A Review
TLDR
This paper discusses the basic model of reinforcement learning, the optimal policies used in RL, the main policies used to reward the agent, including model-free and model-based policies, and some directions for future research in reinforcement learning.
Algorithms for Reinforcement Learning
TLDR
This book focuses on those algorithms of reinforcement learning that build on the powerful theory of dynamic programming. It gives a fairly comprehensive catalog of learning problems and describes the core ideas, followed by a discussion of their theoretical properties and limitations.
Control Optimization with Reinforcement Learning
TLDR
This chapter focuses on a relatively new methodology called reinforcement learning (RL), a form of simulation-based dynamic programming, primarily used for solving Markov and semi-Markov decision problems.
Reinforcement Learning: A Technical Introduction – Part I
Reinforcement learning provides a cognitive science perspective to behavior and sequential decision making provided that reinforcement learning algorithms introduce a computational concept of agency
Reinforcement Learning and Its Relationship to Supervised Learning
TLDR
This chapter discusses stochastic sequential decision processes from the perspective of Machine Learning, focussing on reinforcement learning and its relationship to the more commonly studied supervised learning problems.
Derivative-Free Reinforcement Learning: A Review
TLDR
Methods of derivative-free reinforcement learning to date are summarized, and the methods in aspects including parameter updating, model selection, exploration, and parallel/distributed methods are organized.
Online learning of shaping rewards in reinforcement learning
TLDR
This paper shows how, in the absence of knowledge to define the potential function manually, this function can be learned online in parallel with the actual reinforcement learning process.
Influence Value Q-Learning: A Reinforcement Learning Algorithm for Multi Agent Systems
TLDR
The use of multi-agent systems became popular in the solution of computational problems like e-commerce, scheduling in transportation problems, estimation of energy demand, content-based image retrieval, and others.
Opposition-Based Reinforcement Learning
  • H. Tizhoosh
  • Computer Science
  • J. Adv. Comput. Intell. Intell. Informatics
  • 2006
TLDR
Opposition-based reinforcement learning, inspired by opposition-based learning, is introduced to speed up convergence. Considering opposite actions simultaneously enables individual states to be updated more than once, shortening exploration and expediting convergence.
Algorithms and Representations for Reinforcement Learning
TLDR
This thesis introduces a new class of Reinforcement Learning algorithms, which leverage the power of a statistical set of tools known as Gaussian Processes, and offers viable solutions to some of the major limitations of current Reinforcement Learning methods.

References

SHOWING 1-10 OF 641 REFERENCES
Problem solving with reinforcement learning
This thesis is concerned with practical issues surrounding the application of reinforcement learning techniques to tasks that take place in high dimensional continuous state-space environments. In
Self-improving reactive agents based on reinforcement learning, planning and teaching
TLDR
This paper compares eight reinforcement learning frameworks: adaptive heuristic critic (AHC) learning due to Sutton, Q-learning due to Watkins, and three extensions to both basic methods for speeding up learning. The three extensions are experience replay, learning action models for planning, and teaching.
Reinforcement Learning: A Survey
TLDR
Central issues of reinforcement learning are discussed, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.
Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons
TLDR
This paper describes the input generalization problem (whereby the system must generalize to produce similar actions in similar situations) and an implemented solution, the G algorithm, which is based on recursive splitting of the state space based on statistical measures of differences in reinforcements received.
On the Computational Economics of Reinforcement Learning
TLDR
It is suggested that given a fixed amount of computational power available per control action, it may be better to use a direct reinforcement learning method augmented with indirect techniques than to devote all available resources to a computationally costly indirect method.
Importance sampling for reinforcement learning with multiple objectives
TLDR
This thesis considers three complications that arise from applying reinforcement learning to a real-world application, and employs importance sampling (likelihood ratios) to achieve good performance in partially observable Markov decision processes with few data.
Adaptive Confidence and Adaptive Curiosity
Much of the recent research on adaptive neuro-control and reinforcement learning focusses on systems with adaptive 'world models'. Previous approaches, however, do not address the problem of modelling
Modular on-line function approximation for scaling up reinforcement learning
TLDR
This dissertation extends existing ways of scaling up reinforcement learning methods and proposes several new approaches that can be used to enable reinforcement learning agents to acquire context-dependent evaluation functions and policies.
Gradient Descent for General Reinforcement Learning
TLDR
A simple learning rule is derived, the VAPS algorithm, which can be instantiated to generate a wide range of new reinforcement-learning algorithms, and allows policy-search and value-based algorithms to be combined, thus unifying two very different approaches to reinforcement learning into a single Value and Policy Search algorithm.
Reinforcement Learning with a Hierarchy of Abstract Models
TLDR
Simulations on a set of compositionally-structured navigation tasks show that H-DYNA can learn to solve them faster than conventional RL algorithms, and the abstract models can be used to solve stochastic control tasks.