Corpus ID: 57761147

Nearly Optimal Adaptive Procedure with Change Detection for Piecewise-Stationary Bandit

@inproceedings{Cao2019NearlyOA,
  title={Nearly Optimal Adaptive Procedure with Change Detection for Piecewise-Stationary Bandit},
  author={Y. Cao and Z. Wen and Branislav Kveton and Yao Xie},
  booktitle={AISTATS},
  year={2019}
}
Multi-armed bandit (MAB) is a class of online learning problems in which a learning agent aims to maximize its expected cumulative reward while repeatedly choosing among arms with unknown reward distributions. We consider a scenario where the reward distributions may change in a piecewise-stationary fashion at unknown time steps. We show that by combining classic UCB algorithms with a simple change-detection component that detects and adapts to changes, our so-called M-UCB algorithm can achieve…
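The abstract's idea — a UCB learner wrapped with a change detector that restarts the learner when a distribution shift is flagged — can be illustrated with a minimal sketch. This is not the authors' code: the sliding-window detector (compare the two halves of the last `w` rewards against a threshold `b`) and the forced-exploration rate `gamma` are hypothetical parameter names chosen here for illustration.

```python
import math
import random


class ChangeDetectingUCB:
    """Illustrative sketch: UCB1 plus a per-arm sliding-window change detector.

    Assumed tuning parameters (not from the paper's text shown here):
      w     -- detection window length per arm
      b     -- detection threshold on the half-window mean shift
      gamma -- forced uniform-exploration probability, so that every arm
               keeps receiving samples and changes on currently "bad"
               arms can still be noticed
    """

    def __init__(self, n_arms, w=100, b=10.0, gamma=0.05):
        self.n_arms = n_arms
        self.w, self.b, self.gamma = w, b, gamma
        self.reset()

    def reset(self):
        # Restart learning from scratch after a detected change.
        self.t = 0
        self.counts = [0] * self.n_arms
        self.sums = [0.0] * self.n_arms
        self.windows = [[] for _ in range(self.n_arms)]

    def select_arm(self):
        if random.random() < self.gamma:          # forced exploration
            return random.randrange(self.n_arms)
        for a in range(self.n_arms):              # play each arm once first
            if self.counts[a] == 0:
                return a
        # Standard UCB1 index: empirical mean + confidence radius.
        return max(
            range(self.n_arms),
            key=lambda a: self.sums[a] / self.counts[a]
            + math.sqrt(2 * math.log(self.t) / self.counts[a]),
        )

    def update(self, arm, reward):
        """Record a reward; return True if a change was detected."""
        self.t += 1
        self.counts[arm] += 1
        self.sums[arm] += reward
        win = self.windows[arm]
        win.append(reward)
        if len(win) > self.w:
            win.pop(0)
        if len(win) == self.w:
            h = self.w // 2
            # Mean-shift test: do the two halves of the window disagree?
            if abs(sum(win[h:]) - sum(win[:h])) > self.b:
                self.reset()
                return True
        return False
```

Usage: call `select_arm()`, observe a reward, then `update(arm, reward)`; a `True` return signals that the detector fired and all UCB statistics were reset, which is the "detect and adapt" behavior the abstract describes.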
Cited by:
  • Be Aware of Non-Stationarity: Nearly Optimal Algorithms for Piecewise-Stationary Cascading Bandits (2)
  • A Near-Optimal Change-Detection Based Algorithm for Piecewise-Stationary Combinatorial Semi-Bandits (2)
  • Weighted Linear Bandits for Non-Stationary Environments (17)
  • Algorithms for Non-Stationary Generalized Linear Bandits (1)
  • Distribution-dependent and Time-uniform Bounds for Piecewise i.i.d. Bandits (1)
  • A Linear Bandit for Seasonal Environments (1)
  • Multiscale Non-stationary Stochastic Bandits
  • Recurrent Neural-Linear Posterior Sampling for Non-Stationary Bandits
