Exploration in Gradient-Based Reinforcement Learning

Nicolas Meuleau and Leonid Peshkin
Gradient-based policy search is an alternative to value-function-based methods for reinforcement learning in non-Markovian domains. One apparent drawback of policy search is its requirement that all actions be "on-policy"; that is, that there be no explicit exploration. In this paper, we provide a method for using importance sampling to allow any well-behaved directed exploration policy during learning. We show both theoretically and experimentally that using this method can achieve dramatic…


