Bayesian actor-critic algorithms

@inproceedings{Ghavamzadeh2007BayesianAA,
  title={Bayesian actor-critic algorithms},
  author={Mohammad Ghavamzadeh and Yaakov Engel},
  booktitle={ICML},
  year={2007}
}
We present a new actor-critic learning model that uses a Bayesian class of non-parametric critics based on Gaussian process temporal difference learning. Such critics model the state-action value function as a Gaussian process, allowing Bayes' rule to be used to compute the posterior distribution over state-action value functions, conditioned on the observed data. Appropriate choices of the prior covariance (kernel) between state-action values and of the parametrization of the policy…
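
The posterior computation described in the abstract can be illustrated with ordinary Gaussian process regression. This is only a minimal sketch and not the paper's Gaussian process temporal difference algorithm: it assumes noisy direct observations of state-action values instead of rewards linked through temporal differences. The RBF kernel, the function names `rbf_kernel` and `gp_posterior`, the noise level, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)


def gp_posterior(X_train, y_train, X_query, noise_var=0.01):
    """Posterior mean and covariance of a zero-mean GP at X_query.

    Assumption: y_train holds noisy observations of the value function
    itself (plain GP regression), which is simpler than the GPTD setting.
    """
    K = rbf_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_query)
    K_ss = rbf_kernel(X_query, X_query)

    # Cholesky factorisation gives a stable way to apply K^{-1}.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha

    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v
    return mean, cov


# Hypothetical toy data: each row is a (state, action) pair.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y_train = np.array([0.0, 1.0, 1.0, 2.0])
X_query = np.array([[0.5, 0.5], [1.0, 1.0]])

mean, cov = gp_posterior(X_train, y_train, X_query)
print("posterior mean:", mean)
print("posterior variance:", np.diag(cov))
```

The diagonal of `cov` is the posterior variance at each query pair. It shrinks near observed pairs, and this uncertainty estimate is what distinguishes a Bayesian critic from a point estimate of the value function.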
This paper has 48 citations.