Learning Continuous Control Policies by Stochastic Value Gradients

@inproceedings{Heess2015LearningCC,
  title={Learning Continuous Control Policies by Stochastic Value Gradients},
  author={Nicolas Heess and Greg Wayne and David Silver and Timothy P. Lillicrap and Yuval Tassa and Tom Erez},
  booktitle={NIPS},
  year={2015}
}
We present a unified framework for learning continuous control policies using backpropagation. It supports stochastic control by treating stochasticity in the Bellman equation as a deterministic function of exogenous noise. The product is a spectrum of general policy gradient algorithms that range from model-free methods with value functions to model-based methods without value functions. We use learned models but only require observations from the environment instead of observations from model-predicted trajectories, minimizing the impact of compounded model errors. We apply these algorithms first to a toy stochastic control problem and then to several physics-based control problems in simulation. One of these variants, SVG(1), shows the effectiveness of learning models, value functions, and policies simultaneously in continuous domains.
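As a concrete illustration of the core idea (writing the policy's stochasticity as a deterministic function of exogenous noise, so that value gradients can be backpropagated through the sampled action), here is a minimal PyTorch sketch in the style of the paper's SVG(0) variant. The network sizes, the Gaussian policy parameterization, and the assumption of an already-trained critic are illustrative choices for this sketch, not the paper's exact setup.

import torch

state_dim, action_dim = 4, 2

# Gaussian policy network: outputs the mean and log-std of the action distribution.
policy = torch.nn.Sequential(
    torch.nn.Linear(state_dim, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 2 * action_dim))

# Action-value critic Q(s, a); assumed already trained for this sketch.
q_net = torch.nn.Sequential(
    torch.nn.Linear(state_dim + action_dim, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1))

opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def reparameterized_action(state):
    # Write the stochastic action as a deterministic function of the state
    # and exogenous noise eps ~ N(0, I): a = mu(s) + sigma(s) * eps.
    # Gradients then flow through the sample into the policy parameters.
    mu, log_sigma = policy(state).chunk(2, dim=-1)
    eps = torch.randn_like(mu)
    return mu + log_sigma.exp() * eps

# One SVG(0)-style policy update: ascend dQ/da * da/dtheta by
# backpropagating through the critic and the reparameterized action.
states = torch.randn(32, state_dim)  # stand-in for a batch of observed states
actions = reparameterized_action(states)
loss = -q_net(torch.cat([states, actions], dim=-1)).mean()
opt.zero_grad()
loss.backward()
opt.step()

Moving along the paper's spectrum, SVG(1) would additionally backpropagate through a one-step learned dynamics model before bootstrapping with a value function, while SVG(infinity) backpropagates through the model over whole trajectories and needs no value function at all.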