Simple statistical gradient-following algorithms for connectionist reinforcement learning

@article{Williams2004SimpleSG,
  title={Simple statistical gradient-following algorithms for connectionist reinforcement learning},
  author={Ronald J. Williams},
  journal={Machine Learning},
  year={1992},
  volume={8},
  pages={229-256}
}
This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. [...] Specific examples of such algorithms are presented, some of which bear a close relationship to certain existing algorithms while others are novel but potentially interesting in their own right. Also given are results that show how such algorithms can be naturally integrated with backpropagation. We close with a brief discussion of a number of additional…
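The central result of this REINFORCE family is the weight update Δw_ij = α_ij (r − b_ij) e_ij, where e_ij = ∂ ln g_i / ∂ w_ij is the "characteristic eligibility" of unit i's sampling distribution g_i. The sketch below applies this update to a single Bernoulli-logistic unit; it is a minimal illustration rather than the paper's full algorithm, and the reward function, learning rate, and baseline are illustrative assumptions.

```python
import numpy as np

def reinforce_step(w, x, reward_fn, baseline=0.0, lr=0.1):
    """One REINFORCE update for a single Bernoulli-logistic unit.

    Implements delta_w_j = lr * (r - baseline) * e_j with characteristic
    eligibility e_j = d ln g / d w_j = (y - p) * x_j, where p = sigmoid(w.x)
    is the firing probability and y is the sampled 0/1 output.
    reward_fn is a hypothetical stand-in for the environment.
    """
    p = 1.0 / (1.0 + np.exp(-w @ x))   # Bernoulli parameter of the unit
    y = float(np.random.rand() < p)    # stochastic output: fire or not
    r = reward_fn(y)                   # scalar reinforcement for this trial
    e = (y - p) * x                    # characteristic eligibility
    return w + lr * (r - baseline) * e

# Illustrative use: the unit learns to fire on a fixed input pattern
# because this (hypothetical) environment rewards output 1.
w, x = np.zeros(3), np.array([1.0, 0.5, -0.5])
for _ in range(500):
    w = reinforce_step(w, x, reward_fn=lambda y: float(y == 1.0))
```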
Local online learning in recurrent networks with random feedback
This work derives an approximation to gradient-based learning that comports with known biological features of the brain, such as causality and locality, by requiring synaptic weight updates to depend only on local information about pre- and postsynaptic activities, and proposes an augmented circuit architecture that allows the RNN to concatenate short-duration patterns into sequences of longer duration.
Learning to solve the credit assignment problem
A hybrid learning approach that learns to approximate the gradient and can match or exceed the performance of exact gradient-based learning in both feedforward and convolutional networks.
An Alternative to Backpropagation in Deep Reinforcement Learning
An algorithm called MAP propagation is proposed that can reduce this variance significantly while retaining the local property of the learning rule, and can solve common reinforcement learning tasks at a speed similar to that of backpropagation when applied to an actor-critic network.
Improved Stochastic Synapse Reinforcement Learning for Continuous Actions in Sharply Changing Environments
Reinforcement learning in continuous action spaces requires mechanisms that allow for exploration of infinite possible actions. One challenging issue in such systems is the amount of exploration…
Novel combinations of these equations outperform either set of equations alone in terms of both learning rate and consistency in a set of multidimensional robot inverse kinematics problems.
Global Reinforcement Learning in Neural Networks with Stochastic Synapses
  • Xiaolong Ma, K. Likharev
  • Computer Science
  • The 2006 IEEE International Joint Conference on Neural Network Proceedings
  • 2006
This formulation of the REINFORCE learning principle enables it to be applied to global reinforcement learning in networks with deterministic neural cells but stochastic synapses, and suggests two groups of new learning rules for such networks, including simple local rules.
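A rough sketch of the stochastic-synapse idea, under the assumption of Gaussian weight noise with a fixed width (the model and names here are illustrative, not necessarily the authors' formulation): sampling each weight from a per-synapse Gaussian and applying REINFORCE to the means yields an update that is local to each synapse, even though the cell itself is deterministic.

```python
import numpy as np

def stochastic_synapse_step(mu, sigma, x, reward_fn, baseline=0.0, lr=0.01):
    """REINFORCE on per-synapse Gaussian weight noise (illustrative sketch).

    Weights are sampled w ~ N(mu, sigma^2) and the deterministic cell
    computes y = tanh(w . x). The mean update uses the Gaussian
    characteristic eligibility d ln N(w; mu, sigma) / d mu
    = (w - mu) / sigma**2, so each synapse needs only its own noise
    sample. reward_fn is a hypothetical stand-in for the global reward.
    """
    w = mu + sigma * np.random.randn(*mu.shape)  # stochastic synapses
    y = np.tanh(w @ x)                           # deterministic cell
    r = reward_fn(y)                             # global reinforcement
    mu = mu + lr * (r - baseline) * (w - mu) / sigma**2
    return mu, y
```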
Global Reinforcement Learning in Neural Networks
Numerical simulations have shown that, for simple classification and reinforcement learning tasks, at least one family of the new learning rules gives results comparable to those provided by the well-known A_{R-I} and A_{R-P} rules for Boltzmann machines.
Gradient estimation in dendritic reinforcement learning
It is suggested that the availability of nonlocal feedback for learning is a key advantage of complex neurons over networks of simple point neurons, which have previously been found to be largely equivalent with regard to computational capability.
Reinforcement Learning: An Introduction
This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

References

Showing 1-10 of 61 references
Function Optimization using Connectionist Reinforcement Learning Algorithms
One of these variants, called REINFORCE/MENT, represents a novel but principled approach to reinforcement learning in nontrivial networks which incorporates an entropy maximization strategy.
On the use of backpropagation in associative reinforcement learning
  • R. Williams
  • Computer Science
  • IEEE 1988 International Conference on Neural Networks
  • 1988
A description is given of several ways that backpropagation can be useful in training networks to perform associative reinforcement learning tasks, and it is observed that such an approach even permits a seamless blend of associative reinforcement learning and supervised learning within the same network.
A stochastic reinforcement learning algorithm for learning real-valued functions
A stochastic reinforcement learning algorithm is presented for learning functions with continuous outputs using a connectionist network, which learns to perform an underconstrained positioning task using a simulated 3-degree-of-freedom robot arm.
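In the same spirit, a continuous-output stochastic unit can be sketched as a Gaussian whose mean is learned by REINFORCE. This is an illustrative reconstruction rather than the paper's exact algorithm; a fixed exploration width is assumed here for simplicity, and reward_fn is a hypothetical helper.

```python
import numpy as np

def gaussian_unit_step(theta, x, reward_fn, sigma=0.1, baseline=0.0, lr=0.05):
    """Continuous-action REINFORCE with a Gaussian output unit (sketch).

    The unit emits a ~ N(mu, sigma^2) with mu = theta . x. The REINFORCE
    eligibility for the mean parameters is
    d ln N(a; mu, sigma) / d theta = (a - mu) / sigma**2 * x.
    sigma is held fixed; reward_fn stands in for the environment.
    """
    mu = theta @ x
    a = mu + sigma * np.random.randn()   # sampled real-valued action
    r = reward_fn(a)                     # scalar reinforcement
    theta = theta + lr * (r - baseline) * (a - mu) / sigma**2 * x
    return theta, a
```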
Learning by statistical cooperation of self-interested neuron-like computing elements.
  • A. Barto
  • Computer Science, Medicine
  • Human neurobiology
  • 1985
It is argued that some of the longstanding problems concerning adaptation and learning by networks might be solvable by this form of cooperativity, and computer simulation experiments are described that show how networks of self-interested components that are sufficiently robust can solve rather difficult learning problems.
Neuronlike adaptive elements that can solve difficult learning control problems
It is shown how a system consisting of two neuronlike adaptive elements can solve a difficult learning control problem, and the relation of this work to classical and instrumental conditioning in animal learning studies and its possible implications for research in the neurosciences are discussed.
Pattern-recognizing stochastic learning automata
  • A. Barto, P. Anandan
  • Computer Science
  • IEEE Transactions on Systems, Man, and Cybernetics
  • 1985
A class of learning tasks is described that combines aspects of learning automata tasks and supervised learning pattern-classification tasks. These tasks are called associative reinforcement learning tasks.
A new approach to the design of reinforcement schemes for learning automata
The generality of this method of designing learning schemes is pointed out, and it is shown that a very minor modification will enable the algorithm to learn in a multiteacher environment as well.
Forward Models: Supervised Learning with a Distal Teacher
This article demonstrates that certain classical problems associated with the notion of the “teacher” in supervised learning can be solved by judicious use of learned internal models as components of the adaptive system.
Learning and Sequential Decision Making
It is shown how a TD method can be understood as a novel synthesis of concepts from the theory of stochastic dynamic programming, which is the standard method for solving sequential decision-making problems.
Associative search network: A reinforcement learning associative memory
An associative memory system is presented which does not require a “teacher” to provide the desired associations. For each input key it conducts a search for the output pattern which optimizes an…