Minimax Optimal Fixed-Budget Best Arm Identification in Linear Bandits
@inproceedings{Yang2021MinimaxOF,
  title={Minimax Optimal Fixed-Budget Best Arm Identification in Linear Bandits},
  author={Junwen Yang and Vincent Yan Fu Tan},
  year={2021}
}
We study the problem of best arm identification in linear bandits in the fixed-budget setting. By leveraging properties of the G-optimal design and incorporating it into the arm allocation rule, we design a parameter-free algorithm, Optimal Design-based Linear Best Arm Identification (OD-LinBAI). We provide a theoretical analysis of the failure probability of OD-LinBAI. The performance of OD-LinBAI depends not on all the optimality gaps but only on the gaps of the top d arms, where d is the…
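The abstract refers to the G-optimal design, which chooses a distribution π over arms that minimizes the largest value of aᵀA(π)⁻¹a, where A(π) = Σ π(a) a aᵀ. The following is a minimal sketch of one standard way to approximate such a design, a Frank-Wolfe iteration; it is not taken from the paper and is not the OD-LinBAI algorithm itself. The function name `g_optimal_design`, the tolerance `tol`, and the iteration cap `max_iter` are hypothetical, and the sketch assumes the arm vectors span the whole space so that A(π) is invertible.

```python
import numpy as np


def g_optimal_design(arms, tol=0.01, max_iter=10_000):
    """Approximate a G-optimal design over the rows of `arms` (illustrative sketch).

    Assumes the rows of `arms` span R^d. Returns the design weights `pi` and
    the last computed maximum of a^T A(pi)^{-1} a over the arms.
    """
    arms = np.asarray(arms, dtype=float)
    n_arms, d = arms.shape
    pi = np.full(n_arms, 1.0 / n_arms)  # start from the uniform design
    g = np.inf
    for _ in range(max_iter):
        # Design matrix A(pi) = sum_a pi(a) * a a^T
        A = arms.T @ (pi[:, None] * arms)
        # Per-arm value a^T A(pi)^{-1} a
        lev = np.einsum("ij,jk,ik->i", arms, np.linalg.inv(A), arms)
        k = int(np.argmax(lev))
        g = lev[k]
        # By the Kiefer-Wolfowitz theorem the optimal value equals d
        if g <= (1.0 + tol) * d:
            break
        # Frank-Wolfe step toward the arm with the largest value
        step = (g / d - 1.0) / (g - 1.0)
        pi = (1.0 - step) * pi
        pi[k] += step
    return pi, g
```

For example, `pi, g = g_optimal_design(np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5]]))` returns weights that sum to one; when the loop stops before `max_iter`, `g` is within a factor `1 + tol` of the dimension.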
References
Best-Arm Identification in Linear Bandits
- Computer Science · NIPS
- 2014
The importance of exploiting the global linear structure to improve the estimate of the reward of near-optimal arms is shown and the connection to the G-optimality criterion used in optimal experimental design is pointed out.
Optimal Best-arm Identification in Linear Bandits
- Computer Science · NeurIPS
- 2020
A simple algorithm is devised whose sampling complexity matches known instance-specific lower bounds, asymptotically almost surely and in expectation, and that remarkably can be updated as rarely as the authors wish, without compromising its theoretical guarantees.
On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits
- Computer Science · IEEE Transactions on Signal Processing
- 2017
The side-observation model, where pulling an arm reveals the rewards of its related arms, is studied, and improved theoretical guarantees in the pure-exploration setting are established, and the nonlinear algorithm outperforms the state-of-the-art.
Best Arm Identification in Linear Bandits with Linear Dimension Dependency
- Computer Science · ICML
- 2018
An algorithm whose sample complexity depends linearly on the dimension d, as well as an algorithm with sample complexity dependent on the reward gaps of the best d arms, matching the lower bound arising from the ordinary top-arm identification problem.
Almost Optimal Exploration in Multi-Armed Bandits
- Computer Science · ICML
- 2013
Two novel, parameter-free algorithms for identifying the best arm, in two different settings: given a target confidence and given a target budget of arm pulls, are presented, for which upper bounds whose gap from the lower bound is only doubly-logarithmic in the problem parameters are proved.
An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits
- Computer Science · NeurIPS
- 2020
An algorithm whose sample complexity scales with the geometry of the instance and avoids an explicit union bound over the number of arms is provided, and in addition is computationally efficient for combinatorial classes, e.g. shortest-path, matchings and matroids.
Stochastic Linear Optimization under Bandit Feedback
- Computer Science, Mathematics · COLT
- 2008
A nearly complete characterization of the classical stochastic k-armed bandit problem in terms of both upper and lower bounds for the regret is given, and two variants of an algorithm based on the idea of “upper confidence bounds” are presented.
Robust Pure Exploration in Linear Bandits with Limited Budget
- Computer Science · ICML
- 2021
A new algorithm is provided that identifies the best arm with high probability while being robust to unknown levels of observation noise as well as to moderate levels of misspecification in the linear model.
Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
- Computer Science · NIPS
- 2012
A performance bound is proved for the two versions of the UGapE algorithm showing that the two problems are characterized by the same notion of complexity.