Policy evaluation with temporal differences: a survey and comparison

@article{Dann2014PolicyEW,
  title={Policy evaluation with temporal differences: a survey and comparison},
  author={Christoph Dann and Gerhard Neumann and Jan Peters},
  journal={Journal of Machine Learning Research},
  year={2014},
  volume={15},
  pages={809--883}
}
Policy evaluation is an essential step in most reinforcement learning approaches. It yields a value function, the quality assessment of states for a given policy, which can be used in a policy improvement step. Since the late 1980s, this research area has been dominated by temporal-difference (TD) methods due to their data-efficiency. However, core issues such as stability guarantees in the off-policy scenario, improved sample efficiency and probabilistic treatment of the uncertainty in the…

Statistics

[Chart: Citations per Year, 2012-2018]

89 Citations

Semantic Scholar estimates that this publication has 89 citations based on the available data.

See our FAQ for additional information.