Residual Bootstrap Exploration for Bandit Algorithms

@article{Wang2020ResidualBE,
  title={Residual Bootstrap Exploration for Bandit Algorithms},
  author={C. Wang and Yang Yu and Botao Hao and Guang Cheng},
  journal={ArXiv},
  year={2020},
  volume={abs/2002.08436}
}
In this paper, we propose a novel perturbation-based exploration method for bandit algorithms with bounded or unbounded rewards, called residual bootstrap exploration (\texttt{ReBoot}). \texttt{ReBoot} enforces exploration by injecting data-driven randomness through a residual-based perturbation mechanism. This mechanism captures the underlying distributional properties of the fitting errors and, more importantly, boosts exploration to escape from suboptimal solutions (for small sample…
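The mechanism described in the abstract is simple enough to sketch. Below is a minimal Python illustration of residual-bootstrap exploration for a K-armed bandit: each round, every arm's sample mean is perturbed by a randomly reweighted sum of its own residuals, and the arm with the largest perturbed mean is pulled. The function names, the Gaussian multiplier weights, and the two-point pseudo-residual augmentation (controlled by the hypothetical `sigma_boost` parameter) are assumptions made for illustration, not the paper's exact construction.

```python
import numpy as np


def reboot_index(rewards, rng, sigma_boost=1.0):
    """ReBoot-style index for one arm (illustrative sketch).

    rewards:     rewards observed so far for this arm.
    sigma_boost: scale of the pseudo-residuals (hypothetical knob).
    """
    rewards = np.asarray(rewards, dtype=float)
    n = rewards.size
    mean = rewards.mean()
    # Residuals are the fitting errors of the sample mean.
    residuals = rewards - mean
    # Two pseudo-residuals keep the perturbation non-degenerate for
    # tiny samples (a crude stand-in for the paper's exact scheme).
    residuals = np.concatenate([residuals, [sigma_boost, -sigma_boost]])
    # Gaussian multiplier weights randomly re-sign and re-scale the
    # residuals, injecting data-driven randomness into the estimate.
    weights = rng.standard_normal(residuals.size)
    return mean + (weights * residuals).sum() / n


def reboot_bandit(arms, horizon, sigma_boost=1.0, seed=0):
    """Run a K-armed bandit, pulling the arm with the largest index.

    arms: zero-argument callables, each returning a stochastic reward.
    """
    rng = np.random.default_rng(seed)
    history = [[arm()] for arm in arms]  # pull every arm once to start
    for _ in range(horizon - len(arms)):
        indices = [reboot_index(h, rng, sigma_boost) for h in history]
        best = int(np.argmax(indices))
        history[best].append(arms[best]())
    return history


# Toy usage: three Gaussian arms with means 0.2, 0.5, 0.8.
env = np.random.default_rng(1)
arms = [lambda m=m: env.normal(m, 1.0) for m in (0.2, 0.5, 0.8)]
pulls = [len(h) for h in reboot_bandit(arms, horizon=1000)]
print(pulls)  # the 0.8 arm should receive most of the pulls
```

In this sketch the residuals shrink as an arm accumulates data, so the injected randomness fades automatically, with no confidence widths or posterior sampling to maintain.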
2 Citations

  • Sub-sampling for Efficient Non-Parametric Bandit Exploration
