Reinforcement Learning and Its Relationship to Supervised Learning

  • Jennie Si, Andrew G. Barto, Warren B. Powell, Donald C. Wunsch
The modern study of approximate dynamic programming (DP) combines ideas from several research traditions. Among these is the field of Artificial Intelligence, whose earliest period focussed on creating artificial learning systems. Today, Machine Learning is an active branch of Artificial Intelligence (although it includes researchers from many other disciplines as well) devoted to continuing the development of artificial learning systems. Some of the problems studied in Machine Learning concern… 

Reinforcement Learning in Neural Networks: A Survey

This paper surveys the state of the art of NNRL algorithms, with a focus on robotics applications; the survey opens with a discussion of the core concepts of RL.

Understanding the Reinforcement Learning

  • Nuo Xu
  • Computer Science
    Journal of Physics: Conference Series
  • 2019
This paper discusses reinforcement learning from the perspective of the Markov Decision Process and the Partially Observable Markov Decision Process, the core frameworks underlying reinforcement learning.

The logic of adaptive behavior : knowledge representation and algorithms for the Markov decision process framework in first-order domains

This book studies lifting Markov decision processes, reinforcement learning, and dynamic programming to the first-order (or relational) setting, and constructs a methodological translation from the propositional to the relational setting.

AIXIjs: A Software Demo for General Reinforcement Learning

This paper presents AIXIjs, a JavaScript implementation of general reinforcement learning agents, accompanied by a framework for running experiments against various environments and a suite of interactive demos that explore different properties of the agents, similar to REINFORCEjs (Karpathy, 2015).

From Weighted Classification to Policy Search

The algorithm breaks a multistage reinforcement learning problem into a sequence of single-stage reinforcement learning subproblems, each of which is solved via an exact reduction to a weighted-classification problem that can be solved using off-the-shelf methods.
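The single-stage reduction can be sketched in miniature. The function below (a hypothetical name and a simplified weighting scheme, not the paper's exact construction) turns one-step experience tuples into weighted-classification examples: for each state, the best observed action becomes the class label and the observed reward gap becomes the example weight.

```python
def rl_to_weighted_classification(states, actions, rewards):
    """Convert single-stage RL experience (s, a, r) into a
    weighted-classification dataset.  Illustrative sketch only:
    label = best observed action per state, weight = reward gap."""
    by_state = {}
    for s, a, r in zip(states, actions, rewards):
        by_state.setdefault(s, {})[a] = r
    examples = []
    for s, action_rewards in by_state.items():
        best_a = max(action_rewards, key=action_rewards.get)
        # weight: how costly it is to misclassify this state
        weight = action_rewards[best_a] - min(action_rewards.values())
        examples.append((s, best_a, weight))
    return examples
```

Any off-the-shelf classifier that accepts per-example weights (e.g. one supporting a `sample_weight` argument) can then be trained on the resulting examples.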

Structural Return Maximization for Reinforcement Learning

This work uses the principle of Structural Risk Minimization from Statistical Learning Theory to learn policy classes appropriately sized to the amount of available data, applying Rademacher complexity to identify the policy class that maximizes a bound on the return of the best policy in the class.

Learning the Goal Seeking Behaviour for Mobile Robots

This work compares two approaches, supervised learning and reinforcement learning using neural networks; experimental results show that reinforcement learning is better suited to the problem because, unlike supervised learning, it does not require a human expert to generate data, which is expensive.

Policy Search Based Relational Reinforcement Learning using the Cross-Entropy Method

This thesis describes an RRL algorithm named Cerrla that creates policies directly from a set of learned relational “condition-action” rules using the Cross-Entropy Method (CEM) to control policy creation.
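Cerrla's contribution is the relational rule learning; the underlying Cross-Entropy Method it drives is a generic stochastic optimizer. A minimal sketch of that optimizer, assuming a continuous parameter vector rather than Cerrla's discrete rule distributions:

```python
import numpy as np

def cross_entropy_method(score, dim, n_iter=50, pop=100, elite_frac=0.2, seed=0):
    """Generic CEM loop: sample candidates from a Gaussian, keep the
    elite fraction, refit the Gaussian to the elites.  A sketch of the
    optimizer only, not Cerrla's rule-learning algorithm."""
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.ones(dim)
    n_elite = int(pop * elite_frac)
    for _ in range(n_iter):
        samples = rng.normal(mu, sigma, size=(pop, dim))
        scores = np.array([score(x) for x in samples])
        elites = samples[np.argsort(scores)[-n_elite:]]
        mu = elites.mean(axis=0)
        sigma = elites.std(axis=0) + 1e-8   # floor avoids total collapse
    return mu
```

In policy search, `score` would be the estimated return of the policy induced by a parameter sample; here it can be any function to maximize.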

N-Learning: A Reinforcement Learning Paradigm for Multiagent Systems

N-learning is applied to a pursuit-evasion problem where a pursuer aims to calculate optimal policies for the interception of a deterministically moving evader, using an action selection component that can be realised through a number of techniques and a heuristic reinforcement learning reward function.

Reinforcement Learning: An Introduction

This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, ranging from the history of the field's intellectual foundations to the most recent developments and applications.

Reinforcement Learning as Classification: Leveraging Modern Classifiers

It is argued that the use of SVMs, particularly in combination with the kernel trick, can make it easier to apply reinforcement learning as an "out-of-the-box" technique, without extensive feature engineering.

Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding

It is concluded that reinforcement learning can work robustly in conjunction with function approximators, and that there is little justification at present for avoiding the case of general λ.
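Sparse coarse coding of the kind used in these experiments ("tile coding") can be sketched in one dimension; the function name and the 1-D simplification are illustrative, while the experiments themselves use multi-dimensional tilings:

```python
def tile_features(x, n_tilings=4, n_tiles=8, lo=0.0, hi=1.0):
    """Tile-code a scalar in [lo, hi]: each of n_tilings offset grids
    contributes exactly one active tile index, yielding a sparse binary
    feature vector suitable for linear TD methods.  1-D sketch only."""
    width = (hi - lo) / n_tiles
    active = []
    for t in range(n_tilings):
        offset = t * width / n_tilings            # stagger each grid
        idx = int((x - lo + offset) / width)
        idx = min(idx, n_tiles)                   # clamp the shifted top edge
        active.append(t * (n_tiles + 1) + idx)    # disjoint index range per tiling
    return active
```

Nearby inputs share most of their active tiles, which is what lets a linear function approximator generalize locally while remaining cheap to update.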

Model-based Policy Gradient Reinforcement Learning

The paper describes an algorithm that alternates between pruning and exploration, to gather training data in the relevant parts of the state space, and gradient-ascent search, which uses training experiences much more efficiently.
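The gradient-ascent half of such a method can be illustrated with the plain score-function (REINFORCE) update on a softmax policy; this is a generic policy-gradient sketch on a bandit problem, not the paper's model-based algorithm:

```python
import numpy as np

def reinforce_bandit(pull, n_actions=2, n_iter=2000, lr=0.1, seed=0):
    """Gradient ascent on expected reward for a softmax policy over a
    bandit: theta += lr * r * grad(log pi(a)).  Minimal sketch only."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(n_actions)
    for _ in range(n_iter):
        p = np.exp(theta - theta.max())
        p /= p.sum()
        a = rng.choice(n_actions, p=p)
        r = pull(a)                      # sample one reward
        grad = -p
        grad[a] += 1.0                   # grad of log pi(a) for softmax
        theta += lr * r * grad
    p = np.exp(theta - theta.max())
    return p / p.sum()
```

Training experience is used one sample per update here; the surveyed paper's point is precisely that model-based methods can squeeze far more out of each experience.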

Steps toward Artificial Intelligence

  • M. Minsky
  • Computer Science
    Proceedings of the IRE
  • 1961
The discussion is supported by extensive citation of the literature and by descriptions of a few of the most successful heuristic (problem-solving) programs constructed to date.

TD-Gammon: A Self-Teaching Backgammon Program

This chapter describes TD-Gammon, a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results, and is apparently the first application of this algorithm to a complex nontrivial task.

Machine learning

Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

Least-Squares Temporal Difference Learning

This paper presents a simpler derivation of the LSTD(λ) algorithm, which generalizes from λ = 0 to arbitrary values of λ; at the extreme of λ = 1, the resulting algorithm is shown to be a practical formulation of supervised linear regression.
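The λ = 0 case of LSTD reduces to solving a small linear system, A w = b with A = Σ φ(s)(φ(s) − γφ(s'))ᵀ and b = Σ φ(s) r. A minimal LSTD(0) sketch (the paper's derivation covers general λ; the regularization term here is a common numerical convenience, not part of the derivation):

```python
import numpy as np

def lstd(transitions, phi, gamma=0.95, reg=1e-6):
    """LSTD(0): accumulate A and b over (s, r, s') transitions, then
    solve A w = b for the linear value-function weights w."""
    k = len(phi(transitions[0][0]))
    A = reg * np.eye(k)                  # small ridge term for stability
    b = np.zeros(k)
    for s, r, s_next in transitions:
        f, f_next = phi(s), phi(s_next)
        A += np.outer(f, f - gamma * f_next)
        b += f * r
    return np.linalg.solve(A, b)
```

With one-hot features the solution recovers the exact tabular values; with λ = 1, as the paper shows, the same machinery becomes linear regression onto observed returns.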