# Nearly Optimal Adaptive Procedure with Change Detection for Piecewise-Stationary Bandit

```bibtex
@inproceedings{Cao2019NearlyOA,
  title     = {Nearly Optimal Adaptive Procedure with Change Detection for Piecewise-Stationary Bandit},
  author    = {Y. Cao and Z. Wen and Branislav Kveton and Yao Xie},
  booktitle = {AISTATS},
  year      = {2019}
}
```

Multi-armed bandit (MAB) is a class of online learning problems in which a learning agent aims to maximize its expected cumulative reward while repeatedly pulling arms with unknown reward distributions. We consider a scenario where the reward distributions may change in a piecewise-stationary fashion at unknown time steps. We show that by incorporating a simple change-detection component into classic UCB algorithms to detect and adapt to changes, our so-called M-UCB algorithm can achieve…
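The recipe in the abstract (a classic UCB index wrapped with a simple change-detection test that triggers a restart) can be sketched as below. This is an illustrative simplification under stated assumptions, not the paper's exact M-UCB: the window length `w`, threshold `b`, and the full-reset rule are placeholder choices, and M-UCB's forced uniform exploration is omitted.

```python
import math


def change_detected(window, w, b):
    """M-UCB-style two-sided test: compare the sums of the two halves of
    the last w rewards of an arm; flag a change if they differ by more
    than a threshold b. Both w and b are tuning parameters here."""
    if len(window) < w:
        return False
    recent = window[-w:]
    first, second = sum(recent[: w // 2]), sum(recent[w // 2 :])
    return abs(second - first) > b


def m_ucb(arms, horizon, w=100, b=10.0):
    """Sketch of UCB1 combined with per-arm change detection.
    `arms` is a list of no-argument callables returning rewards in [0, 1].
    On a detected change, all arm statistics are reset (a simplification)."""
    k = len(arms)
    counts = [0] * k
    means = [0.0] * k
    windows = [[] for _ in range(k)]
    total, tau = 0.0, 0  # tau = steps since the last reset
    for _ in range(horizon):
        if tau < k:
            # Play each arm once after a reset.
            i = tau
        else:
            # Otherwise follow the UCB1 index on post-reset statistics.
            i = max(
                range(k),
                key=lambda a: means[a] + math.sqrt(2 * math.log(tau + 1) / counts[a]),
            )
        r = arms[i]()
        total += r
        counts[i] += 1
        means[i] += (r - means[i]) / counts[i]
        windows[i].append(r)
        tau += 1
        if change_detected(windows[i], w, b):
            # Restart learning from scratch after a detected change point.
            counts = [0] * k
            means = [0.0] * k
            windows = [[] for _ in range(k)]
            tau = 0
    return total
```

With two deterministic arms paying 1 and 0, the loop quickly concentrates its pulls on the better arm, while a clear mean shift inside an arm's reward window trips `change_detected` and restarts the statistics.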

#### 17 Citations

- Be Aware of Non-Stationarity: Nearly Optimal Algorithms for Piecewise-Stationary Cascading Bandits (2019)
- A Near-Optimal Change-Detection Based Algorithm for Piecewise-Stationary Combinatorial Semi-Bandits (2020)
- Distribution-dependent and Time-uniform Bounds for Piecewise i.i.d Bandits (2019)
- Contextual-Bandit Based Personalized Recommendation with Time-Varying User Interests (2020)