Recurrent policy gradients

  title={Recurrent policy gradients},
  author={Daan Wierstra and Alexander F{\"o}rster and Jan Peters and J{\"u}rgen Schmidhuber},
  journal={Logic Journal of the IGPL},
Reinforcement learning for partially observable Markov decision problems (POMDPs) is a challenge as it requires policies with an internal state. Traditional approaches suffer significantly from this shortcoming and usually make strong assumptions on the problem domain such as perfect system models, state-estimators and a Markovian hidden system. Recurrent neural networks (RNNs) offer a natural framework for dealing with policy learning using hidden state and require only few limiting…
This paper has 66 citations.

