Simple statistical gradient-following algorithms for connectionist reinforcement learning

@article{williams1992simple,
  title={Simple statistical gradient-following algorithms for connectionist reinforcement learning},
  author={Williams, Ronald J.},
  journal={Machine Learning},
  volume={8},
  number={3--4},
  pages={229--256},
  year={1992}
}
This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. These algorithms, called REINFORCE algorithms, are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement, both in immediate-reinforcement tasks and in certain limited forms of delayed-reinforcement tasks, and they do so without explicitly computing gradient estimates or even storing information from which…
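The update described in the abstract can be illustrated with a minimal sketch. The example below is an assumption-laden toy, not the paper's notation: a single Bernoulli-logistic unit learns, from immediate reinforcement only, a "copy the sign of the input" task. The weight change `lr * r * (y - p) * x` follows the REINFORCE form, where `(y - p) * x` is the characteristic eligibility (the gradient of the log-probability of the sampled output) for this unit; no reinforcement baseline is used. All names and hyperparameters are illustrative.

```python
import math
import random

random.seed(0)

def reinforce_bernoulli(num_steps=5000, lr=0.1):
    """Toy REINFORCE sketch for a single Bernoulli-logistic unit.

    Task (illustrative): reward 1 if the unit outputs 1 for input +1
    and 0 for input -1; reward 0 otherwise. Immediate reinforcement only.
    """
    w, bias = 0.0, 0.0
    for _ in range(num_steps):
        x = random.choice([-1.0, 1.0])
        p = 1.0 / (1.0 + math.exp(-(w * x + bias)))  # Pr(y = 1 | x)
        y = 1.0 if random.random() < p else 0.0      # stochastic unit samples its output
        target = 1.0 if x > 0 else 0.0
        r = 1.0 if y == target else 0.0              # immediate scalar reinforcement
        # REINFORCE update: reward times the eligibility (y - p) * input.
        # In expectation this follows the gradient of expected reinforcement.
        w += lr * r * (y - p) * x
        bias += lr * r * (y - p)
    return w

w = reinforce_bernoulli()
```

After training, `w` is strongly positive, so the unit's output distribution concentrates on the rewarded action for each input. Subtracting a reinforcement baseline from `r` (e.g., a running average of rewards), as the paper discusses, typically reduces the variance of these updates.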
This paper has highly influenced 44 other papers and has roughly 500 citations (Semantic Scholar estimate).

Publications citing this paper (showing 1–10 of 246 extracted citations):

Efficient Dialog Policy Learning via Positive Memory Retention

2018 IEEE Spoken Language Technology Workshop (SLT) • 2018

How to SEQ2SEQ for SQL (anonymous submission)


Iterative policy learning in end-to-end trainable task-oriented neural dialog models

2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) • 2017

Maximum Margin Reward Networks for Learning from Explicit and Implicit Supervision

