Corpus ID: 252408855

Minimax Optimal Fixed-Budget Best Arm Identification in Linear Bandits

Junwen Yang and Vincent Yan Fu Tan
We study the problem of best arm identification in linear bandits in the fixed-budget setting. By leveraging properties of the G-optimal design and incorporating it into the arm allocation rule, we design a parameter-free algorithm, Optimal Design-based Linear Best Arm Identification (OD-LinBAI). We provide a theoretical analysis of the failure probability of OD-LinBAI. Instead of all the optimality gaps, the performance of OD-LinBAI depends only on the gaps of the top d arms, where d is the… 
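The G-optimal design mentioned in the abstract allocates pulls so as to minimise the worst-case variance of the least-squares estimate over all arm directions. As a rough illustration only (not the OD-LinBAI algorithm itself), the design can be approximated with Frank–Wolfe iterations using the Kiefer–Wolfowitz step size; the arm set below is an invented example:

```python
import numpy as np

def g_optimal_design(X, iters=1000):
    """Frank-Wolfe approximation of the G-optimal design.

    X: (K, d) matrix of arm feature vectors spanning R^d.
    Returns a probability vector lam over the K arms that approximately
    minimises max_x x^T A(lam)^{-1} x, where A(lam) = sum_i lam_i x_i x_i^T.
    """
    K, d = X.shape
    lam = np.full(K, 1.0 / K)              # start from the uniform design
    for _ in range(iters):
        A = X.T @ (lam[:, None] * X)       # design matrix A(lam)
        A_inv = np.linalg.inv(A)
        g = np.einsum('ij,jk,ik->i', X, A_inv, X)  # x^T A^{-1} x for each arm
        k = np.argmax(g)                   # worst-estimated direction
        # Kiefer-Wolfowitz step size; g[k] >= d always holds, so gamma >= 0
        gamma = (g[k] / d - 1.0) / (g[k] - 1.0)
        lam = (1.0 - gamma) * lam
        lam[k] += gamma
    return lam

# By the Kiefer-Wolfowitz theorem, max_x x^T A^{-1} x = d at the optimum.
X = np.vstack([np.eye(3), [[0.9, 0.1, 0.0]]])   # 4 illustrative arms in R^3
lam = g_optimal_design(X)
```

At convergence the maximal directional variance equals the dimension d, which is what makes the design useful for bounding estimation error uniformly over arms.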

Best-Arm Identification in Linear Bandits

The importance of exploiting the global linear structure to improve the estimate of the reward of near-optimal arms is shown and the connection to the G-optimality criterion used in optimal experimental design is pointed out.

Optimal Best-arm Identification in Linear Bandits

A simple algorithm is devised whose sample complexity matches known instance-specific lower bounds, asymptotically almost surely and in expectation, and which, remarkably, can be updated as rarely as desired without compromising its theoretical guarantees.

On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits

The side-observation model, in which pulling an arm also reveals the rewards of its related arms, is studied; improved theoretical guarantees are established in the pure-exploration setting, and the proposed algorithm outperforms the state of the art.

Best Arm Identification in Linear Bandits with Linear Dimension Dependency

An algorithm whose sample complexity depends linearly on the dimension d, as well as an algorithm with sample complexity dependent on the reward gaps of the best d arms, matching the lower bound arising from the ordinary top-arm identification problem.

Almost Optimal Exploration in Multi-Armed Bandits

Two novel, parameter-free algorithms for identifying the best arm are presented, in two different settings: given a target confidence and given a target budget of arm pulls. For both, upper bounds are proved whose gap from the lower bound is only doubly logarithmic in the problem parameters.

An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

An algorithm whose sample complexity scales with the geometry of the instance and avoids an explicit union bound over the number of arms is provided, and in addition is computationally efficient for combinatorial classes, e.g. shortest-path, matchings and matroids.

Stochastic Linear Optimization under Bandit Feedback

A nearly complete characterization of the classical stochastic k-armed bandit problem in terms of both upper and lower bounds for the regret is given, and two variants of an algorithm based on the idea of “upper confidence bounds” are presented.
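The "upper confidence bounds" idea underlying the algorithms above is easiest to see in the classical k-armed setting. The following is a minimal UCB1-style sketch (following the standard bonus sqrt(2 ln t / n), not the linear-bandit variants discussed in this paper; the Bernoulli means are invented for illustration):

```python
import numpy as np

def ucb1(means, horizon, rng):
    """Minimal UCB1 on a k-armed Bernoulli bandit.

    At each round, pull the arm maximising
    empirical mean + sqrt(2 ln t / pulls); returns pull counts."""
    k = len(means)
    counts = np.zeros(k)
    sums = np.zeros(k)
    for t in range(horizon):
        if t < k:
            arm = t                                   # initialise: pull each arm once
        else:
            bonus = np.sqrt(2.0 * np.log(t) / counts) # optimism bonus per arm
            arm = int(np.argmax(sums / counts + bonus))
        reward = float(rng.random() < means[arm])     # Bernoulli reward draw
        counts[arm] += 1
        sums[arm] += reward
    return counts

rng = np.random.default_rng(0)
pulls = ucb1([0.3, 0.5, 0.7], horizon=5000, rng=rng)
```

Optimism drives exploration: under-sampled arms carry a large bonus and keep getting tried, while the empirically best arm accumulates the bulk of the pulls.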

Robust Pure Exploration in Linear Bandits with Limited Budget

A new algorithm is provided that identifies the best arm with high probability while being robust to unknown levels of observation noise as well as to moderate levels of misspecification in the linear model.

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence

A performance bound is proved for the two versions of the UGapE algorithm, showing that the fixed-budget and fixed-confidence problems are characterized by the same notion of complexity.

Best Arm Identification in Generalized Linear Bandits