Continuous Upper Confidence Trees with Polynomial Exploration - Consistency

@inproceedings{Auger2013ContinuousUC,
  title={Continuous Upper Confidence Trees with Polynomial Exploration - Consistency},
  author={David Auger and Adrien Cou{\"e}toux and Olivier Teytaud},
  booktitle={ECML/PKDD},
  year={2013}
}
Upper Confidence Trees (UCT) is now a well-known algorithm for sequential decision making; it is a provably consistent variant of Monte-Carlo Tree Search. However, consistency has only been proved in the case where the action space is finite. We here propose a proof in the case of fully observable Markov Decision Processes with bounded horizon, possibly including infinitely many states, infinite action spaces, and arbitrary stochastic transition kernels. We illustrate the consistency on…

