Analysis and Design of Thompson Sampling for Stochastic Partial Monitoring

Taira Tsuchiya, Junya Honda, Masashi Sugiyama
We investigate finite stochastic partial monitoring, which is a general model for sequential learning with limited feedback. While Thompson sampling is one of the most promising algorithms on a variety of online decision-making problems, its properties for stochastic partial monitoring have not been theoretically investigated, and the existing algorithm relies on a heuristic approximation of the posterior distribution. To mitigate these problems, we present a novel Thompson-sampling-based… 

