The non-stationary stochastic multi-armed bandit problem

@article{Allesiardo2017TheNS,
  title={The non-stationary stochastic multi-armed bandit problem},
  author={Robin Allesiardo and R. F{\'e}raud and O. Maillard},
  journal={International Journal of Data Science and Analytics},
  year={2017},
  volume={3},
  pages={267--283}
}
  • Robin Allesiardo, R. Féraud, O. Maillard
  • Published 2017
  • Mathematics, Computer Science
  • International Journal of Data Science and Analytics
  • We consider a variant of the stochastic multi-armed bandit with K arms in which the rewards are not assumed to be identically distributed but are generated by a non-stationary stochastic process. We first study the unique best arm setting, in which there exists one unique best arm. Second, we study the general switching best arm setting, in which the best arm switches at some unknown time steps. For both settings, we target problem-dependent bounds instead of the more conservative problem-free bounds. We…
    29 Citations

    Best-Arm Identification for Quantile Bandits with Privacy
    Adaptively Tracking the Best Bandit Arm with an Unknown Number of Distribution Changes
    • 22
    Sliding-Window Thompson Sampling for Non-Stationary Settings
    • 1
    • Highly Influenced
    Online Model Selection: a Rested Bandit Formulation
    Decentralized Exploration in Multi-Armed Bandits
    • 6
    Best Arm Identification for Contaminated Bandits
    • 19
    Best of both worlds: Stochastic & adversarial best-arm identification
    • 9
    • Highly Influenced
    Learning and Optimization with Seasonal Patterns
    • 1
    Universal Best Arm Identification
    • Cong Shen
    • Computer Science
    • IEEE Transactions on Signal Processing
    • 2019
    • 3
