A Tutorial on Thompson Sampling

@article{Russo2018ATO,
  title={A Tutorial on Thompson Sampling},
  author={D. Russo and Benjamin Van Roy and A. Kazerouni and Ian Osband},
  journal={Found. Trends Mach. Learn.},
  year={2018},
  volume={11},
  pages={1--96}
}
  • D. Russo, Benjamin Van Roy, A. Kazerouni, Ian Osband
  • Published 2018
  • Computer Science, Mathematics
  • Found. Trends Mach. Learn.
  • Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and investing to accumulate new information that may improve future performance. The algorithm addresses a broad range of problems in a computationally efficient manner and is therefore enjoying wide use. This tutorial covers the algorithm and its application, illustrating concepts through a range of…
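  The explore/exploit balance described in the abstract can be illustrated with a minimal sketch. The example below is not taken from the tutorial; it assumes a Bernoulli bandit with independent Beta(1, 1) priors, and the function name `thompson_sampling_bernoulli`, its parameters, and the arm probabilities are all hypothetical choices made for illustration.

```python
import random


def thompson_sampling_bernoulli(true_probs, n_rounds, seed=0):
    """Sketch of Thompson sampling on a Bernoulli bandit (illustrative only).

    true_probs: hypothetical success probability of each arm, unknown to the agent.
    Returns the number of pulls per arm and the total reward collected.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Beta(1, 1) prior on each arm's success probability (an assumption).
    alpha = [1.0] * n_arms
    beta = [1.0] * n_arms
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(n_rounds):
        # Draw one plausible success probability per arm from its posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled values.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Observe a simulated Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate Beta-Bernoulli posterior update.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_sampling_bernoulli([0.2, 0.5, 0.8], n_rounds=2000, seed=1)
    print("pulls per arm:", pulls)
    print("total reward:", total)
```

  Arms with uncertain posteriors occasionally produce high samples and get tried (exploration), while arms with well-established high success rates are chosen most of the time (exploitation); no separate exploration parameter is needed in this sketch.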
    Thompson Sampling via Local Uncertainty
    (Sequential) Importance Sampling Bandits
    • 3
    Collaborative Thompson Sampling
    • 1
    Neural Thompson Sampling
    Satisficing in Time-Sensitive Bandit Learning
    • 8
    Thompson Sampling for the MNL-Bandit
    • 31
    Adaptive Sequential Experiments with Unknown Information Arrival Processes.
    • 1
    Meta-learning of Sequential Strategies
    • 18
    Thompson Sampling for Dynamic Pricing
    • 4
