# Reinforcement Learning and Its Relationship to Supervised Learning

@inproceedings{Si2004ReinforcementLA, title={Reinforcement Learning and Its Relationship to Supervised Learning}, author={Jennie Si and Andrew G. Barto and Warren B. Powell and Donald C. Wunsch}, year={2004} }

The modern study of approximate dynamic programming (DP) combines ideas from several research traditions. Among these is the field of Artificial Intelligence, whose earliest period focussed on creating artificial learning systems. Today, Machine Learning is an active branch of Artificial Intelligence (although it includes researchers from many other disciplines as well) devoted to continuing the development of artificial learning systems. Some of the problems studied in Machine Learning concern…

## 85 Citations

### Reinforcement Learning in Neural Networks: A Survey

- Computer Science
- 2014

This paper describes the state of the art of NNRL algorithms, with a focus on robotics applications; the survey begins with a discussion of the concepts of RL.

### Understanding the Reinforcement Learning

- Computer Science
- Journal of Physics: Conference Series
- 2019

This paper discusses reinforcement learning from the perspective of the Markov decision process and the partially observable Markov decision process, which are the core formal models in reinforcement learning.

### The logic of adaptive behavior : knowledge representation and algorithms for the Markov decision process framework in first-order domains

- Computer Science
- 2008

This book studies lifting Markov decision processes, reinforcement learning and dynamic programming to the first-order (or, relational) setting, and a methodological translation is constructed from the propositional to the relational setting.

### AIXIjs: A Software Demo for General Reinforcement Learning

- Computer Science
- ArXiv
- 2017

AIXIjs is a JavaScript implementation of general reinforcement learning agents, accompanied by a framework for running experiments against various environments and a suite of interactive demos that explore different properties of the agents, similar to REINFORCEjs (Karpathy, 2015).

### From Weighted Classification to Policy Search

- Computer Science
- NIPS
- 2005

The algorithm breaks a multistage reinforcement learning problem into a sequence of single-stage reinforcement learning subproblems, each of which is solved via an exact reduction to a weighted-classification problem that can be solved using off-the-shelf methods.

### Structural Return Maximization for Reinforcement Learning

- Computer Science
- ArXiv
- 2014

This work focuses on learning policy classes that are appropriately sized to the amount of data available, using the principle of Structural Risk Minimization, from Statistical Learning Theory, which uses Rademacher complexity to identify a policy class that maximizes a bound on the return of the best policy in the chosen policy class, given the available data.

### Learning the Goal Seeking Behaviour for Mobile Robots

- Computer Science
- 2018 3rd Asia-Pacific Conference on Intelligent Robot Systems (ACIRS)
- 2018

This paper compares two approaches, supervised learning and reinforcement learning using neural networks. Experimental results show that reinforcement learning is better suited to the problem because it does not require a human expert to generate training data, which is expensive.

### Policy Search Based Relational Reinforcement Learning using the Cross-Entropy Method

- Computer Science
- 2013

This thesis describes an RRL algorithm named Cerrla that creates policies directly from a set of learned relational “condition-action” rules using the Cross-Entropy Method (CEM) to control policy creation.

### Reinforcement learning algorithms with function approximation: Recent advances and applications

- Computer Science
- Inf. Sci.
- 2014

### N-Learning: A Reinforcement Learning Paradigm for Multiagent Systems

- Computer Science
- Australian Conference on Artificial Intelligence
- 2005

N-learning is applied to a pursuit-evasion problem where a pursuer aims to calculate optimal policies for the interception of a deterministically moving evader, using an action selection component that can be realised through a number of techniques and a heuristic reinforcement learning reward function.

## References


### Reinforcement Learning: An Introduction

- Computer Science
- IEEE Transactions on Neural Networks
- 2005

This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

### Reinforcement Learning as Classification: Leveraging Modern Classifiers

- Computer Science
- ICML
- 2003

It is argued that the use of SVMs, particularly in combination with the kernel trick, can make it easier to apply reinforcement learning as an "out-of-the-box" technique, without extensive feature engineering.

### Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding

- Computer Science
- NIPS
- 1995

It is concluded that reinforcement learning can work robustly in conjunction with function approximators, and that there is little justification at present for avoiding the case of general λ.

### Model-based Policy Gradient Reinforcement Learning

- Computer Science
- ICML
- 2003

The paper describes an algorithm that alternates between pruning, exploration to gather training data in the relevant parts of the state space, and gradient ascent search, and that uses training experiences much more efficiently.

### Steps toward Artificial Intelligence

- Computer Science
- Proceedings of the IRE
- 1961

The discussion is supported by extensive citation of the literature and by descriptions of a few of the most successful heuristic (problem-solving) programs constructed to date.

### TD-Gammon: A Self-Teaching Backgammon Program

- Computer Science
- 1995

This chapter describes TD-Gammon, a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results, and is apparently the first application of this algorithm to a complex nontrivial task.

### Machine learning

- Computer Science
- CSUR
- 1996

Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

### Least-Squares Temporal Difference Learning

- Computer Science
- ICML
- 1999

This paper presents a simpler derivation of the LSTD algorithm, which generalizes from λ = 0 to arbitrary values of λ; at the extreme of λ = 1, the resulting algorithm is shown to be a practical formulation of supervised linear regression.
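
The link between temporal-difference learning and supervised regression mentioned in the last summary can be illustrated with a minimal sketch. It uses the commonly stated LSTD(λ) equations with an eligibility trace, not necessarily the derivation given in the cited paper. The function names, the toy `EPISODE` data, and the pure-Python linear solver are assumptions made for illustration only. The sketch assumes a single episode whose terminal state has an all-zero feature vector.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def lstd_lambda(episode, gamma, lam):
    """Estimate linear value-function weights with LSTD(lambda).

    episode: list of (phi, reward, phi_next) transitions from one episode;
    phi_next of the final transition is assumed to be all zeros (terminal).
    """
    d = len(episode[0][0])
    A = [[0.0] * d for _ in range(d)]
    b = [0.0] * d
    z = [0.0] * d  # eligibility trace
    for phi, reward, phi_next in episode:
        z = [gamma * lam * zi + pi for zi, pi in zip(z, phi)]
        diff = [p - gamma * pn for p, pn in zip(phi, phi_next)]
        for i in range(d):
            b[i] += z[i] * reward
            for j in range(d):
                A[i][j] += z[i] * diff[j]
    return solve(A, b)


def regress_on_returns(episode, gamma):
    """Ordinary least-squares fit of discounted returns on features."""
    d = len(episode[0][0])
    A = [[0.0] * d for _ in range(d)]
    b = [0.0] * d
    G = 0.0
    for phi, reward, _ in reversed(episode):
        G = reward + gamma * G  # discounted return from this step onward
        for i in range(d):
            b[i] += phi[i] * G
            for j in range(d):
                A[i][j] += phi[i] * phi[j]
    return solve(A, b)


# Hypothetical toy episode with two features; the last phi_next is terminal.
EPISODE = [
    ([1.0, 0.0], 1.0, [0.5, 0.5]),
    ([0.5, 0.5], 0.0, [0.0, 1.0]),
    ([0.0, 1.0], 2.0, [1.0, 1.0]),
    ([1.0, 1.0], -1.0, [0.0, 0.0]),
]

w_lstd1 = lstd_lambda(EPISODE, gamma=0.9, lam=1.0)
w_regression = regress_on_returns(EPISODE, gamma=0.9)
```

Under these assumptions, `w_lstd1` and `w_regression` agree up to floating-point error, because with λ = 1 the trace-weighted sums telescope into the normal equations for regressing observed returns on features. Setting `lam=0.0` instead gives the usual one-step LSTD solution, which generally differs.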