Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design
@article{Ruan2020LinearBW, title={Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design}, author={Yufei Ruan and Jiaqi Yang and Y. Zhou}, journal={ArXiv}, year={2020}, volume={abs/2007.01980} }
Motivated by practical needs such as large-scale learning, we study the impact of adaptivity constraints on linear contextual bandits, a central problem in online active learning. We consider two popular limited adaptivity models in the literature: batch learning and rare policy switches. We show that, when the context vectors are adversarially chosen in $d$-dimensional linear contextual bandits, the learner needs $O(d \log d \log T)$ policy switches to achieve the minimax-optimal regret, and this…
4 Citations
Double Explore-then-Commit: Asymptotic Optimality and Beyond
- Computer Science, Mathematics
- ArXiv
- 2020
- 2
Provably Efficient Reinforcement Learning with Linear Function Approximation
- Computer Science, Mathematics
- COLT
- 2020
- 102
References
Sequential Batch Learning in Finite-Action Linear Contextual Bandits
- Computer Science, Mathematics
- ArXiv
- 2020
- 10
Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits
- Mathematics, Computer Science
- COLT
- 2019
- 15
- Highly Influential
Online Learning with Switching Costs and Other Adaptive Adversaries
- Computer Science, Mathematics
- NIPS
- 2013
- 57
Contextual Bandits with Linear Payoff Functions
- Mathematics, Computer Science
- AISTATS
- 2011
- 510
- Highly Influential
Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition
- Computer Science, Mathematics
- NeurIPS
- 2020
- 21