Thompson sampling

Known as: Bayesian control rule 
In artificial intelligence, Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration…
Wikipedia
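
The definition above is truncated, so the following is only a minimal sketch of the textbook Beta-Bernoulli form of Thompson sampling, not code taken from any paper listed here. It assumes binary rewards and a uniform Beta(1, 1) prior on each arm; the function name thompson_sampling and the parameters true_probs, n_rounds and seed are hypothetical names chosen for illustration.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling on simulated arms.

    true_probs: success probability of each arm (assumed, for simulation only).
    Returns (pulls, successes, failures), one entry per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one sample from each arm's posterior Beta(successes + 1, failures + 1).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled success probability is largest.
        arm = max(range(n_arms), key=samples.__getitem__)

        # Simulate a Bernoulli reward and update that arm's posterior counts.
        reward = 1 if rng.random() < true_probs[arm] else 0
        pulls[arm] += 1
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.2, 0.5, 0.8], 2000)
    print("pulls per arm:", pulls)
```

Because each arm is played with the probability that its posterior sample is the largest, arms with uncertain estimates still get tried occasionally, while arms that look clearly worse are played less and less often.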

Papers overview

2017
We derive an alternative proof for the regret of Thompson sampling (TS) in the stochastic linear bandit setting. While we obtain…
Highly Cited
2016
We provide an information-theoretic analysis of Thompson sampling that applies across a broad range of online optimization…
2015
Matrix factorization (MF) collaborative filtering is an effective and widely used method in recommendation systems. However, the…
Highly Cited
2014
We consider stochastic multi-armed bandit problems with complex actions over a set of basic arms, where the decision maker plays…
2014
Thompson sampling provides a solution to bandit problems in which new observations are allocated to arms with the posterior…
Highly Cited
2013
Thompson Sampling is one of the oldest heuristics for multi-armed bandit problems. It is a randomized algorithm based on Bayesian…
2013
Thompson Sampling has been demonstrated in many complex bandit models; however, the theoretical guarantees available for the…
Highly Cited
2012
The question of the optimality of Thompson Sampling for solving the stochastic multi-armed bandit problem had been open since…
Highly Cited
2012
We show that the Thompson Sampling algorithm achieves logarithmic expected regret for the Bernoulli multi-armed bandit problem…
Highly Cited
2011
Thompson sampling is one of the oldest heuristics to address the exploration/exploitation trade-off, but it is surprisingly…