Best Arm Identification under Additive Transfer Bandits
@article{Neopane2021BestAI, title={Best Arm Identification under Additive Transfer Bandits}, author={Ojash Neopane and Aaditya Ramdas and Aarti Singh}, journal={2021 55th Asilomar Conference on Signals, Systems, and Computers}, year={2021}, pages={464-470} }
We consider a variant of the best arm identification (BAI) problem in multi-armed bandits (MAB) in which there are two sets of arms (source and target), and the objective is to determine the best target arm while only pulling source arms. In this paper, we study the setting when, despite the means being unknown, there is a known additive relationship between the source and target MAB instances. We show how our framework covers a range of previously studied pure exploration problems and…
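The setting described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's algorithm: it assumes each target mean equals the matching source mean plus a known offset (one simple form an additive relationship could take), uses unit-variance Gaussian rewards, and samples source arms uniformly. The names `identify_best_target_arm`, `source_means`, and `offsets` are hypothetical.

```python
import random


def identify_best_target_arm(source_means, offsets, pulls_per_arm=2000, seed=0):
    """Estimate target means from source pulls only and return the best target arm.

    Assumption (not from the paper): target_mean[i] = source_mean[i] + offsets[i],
    where the offsets are known and the source means are not.
    """
    rng = random.Random(seed)
    target_estimates = []
    for mu, delta in zip(source_means, offsets):
        # Pull the source arm; rewards are assumed Gaussian with unit variance.
        samples = [rng.gauss(mu, 1.0) for _ in range(pulls_per_arm)]
        source_estimate = sum(samples) / pulls_per_arm
        # Transfer the estimate to the target arm through the known offset.
        target_estimates.append(source_estimate + delta)
    best = max(range(len(target_estimates)), key=target_estimates.__getitem__)
    return best, target_estimates


# Hypothetical instance: the best source arm (index 0) is not the best target arm.
source_means = [0.9, 0.5, 0.1]
offsets = [-1.0, 0.0, 0.2]
best_arm, estimates = identify_best_target_arm(source_means, offsets)
print(best_arm, [round(e, 2) for e in estimates])
```

In this hypothetical instance the implied target means are -0.1, 0.5, and 0.3, so the best target arm differs from the best source arm even though only source arms are ever pulled. A fixed-confidence procedure would replace the uniform sampling budget with an adaptive stopping rule.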
One Citation
Max-Quantile Grouped Infinite-Arm Bandits
- Computer Science
- ArXiv
- 2022
The instance-dependent and worst-case regret are characterized, and a matching lower bound for the latter is provided, while discussing various strengths, weaknesses, algorithmic improvements, and potential lower bounds associated with the instance-dependent upper bounds.
References
Structured Best Arm Identification with Fixed Confidence
- Computer Science, Mathematics
- ALT
- 2017
This paper introduces an abstract setting that captures the essential properties of the minimax game search problem, proposes a new algorithm (LUCB-micro) for that setting, and gives lower and upper bounds on its sample complexity.
Pure Exploration of Multi-armed Bandit Under Matroid Constraints
- Computer Science, Mathematics
- COLT
- 2016
This work studies both the exact and PAC versions of Best-Basis, and provides algorithms with nearly-optimal sample complexities for these versions of the pure exploration problem subject to a matroid constraint in a stochastic multi-armed bandit game.
On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models
- Computer Science
- J. Mach. Learn. Res.
- 2016
This work introduces generic notions of complexity for the two dominant frameworks considered in the literature: fixed-budget and fixed-confidence settings, and provides the first known distribution-dependent lower bound on the complexity that involves information-theoretic quantities and holds when m ≥ 1 under general assumptions.
Exploiting Correlation in Finite-Armed Structured Bandits
- Computer Science
- ArXiv
- 2018
We consider a correlated multi-armed bandit problem in which rewards of arms are correlated through a hidden parameter. Our approach exploits the correlation among arms to identify some arms as…
PAC Subset Selection in Stochastic Multi-armed Bandits
- Computer Science
- ICML
- 2012
The expected sample complexity bound for LUCB is novel even for single-arm selection, and a lower bound on the worst case sample complexity of PAC algorithms for Explore-m is given.
Combinatorial Pure Exploration of Multi-Armed Bandits
- Computer Science, Mathematics
- NIPS
- 2014
This paper presents general learning algorithms which work for all decision classes that admit offline maximization oracles in both fixed confidence and fixed budget settings and establishes a general problem-dependent lower bound for the CPE problem.
Information Complexity in Bandit Subset Selection
- Computer Science
- COLT
- 2013
This work considers the problem of efficiently exploring the arms of a stochastic bandit to identify the best subset of a specified size and derives improved bounds by using KL-divergence-based confidence intervals.
Pure Exploration in Multi-armed Bandits Problems
- Computer Science
- ALT
- 2009
The main result is that the required exploration-exploitation trade-offs are qualitatively different, in view of a general lower bound on the simple regret in terms of the cumulative regret.
Monte-Carlo Tree Search by Best Arm Identification
- Computer Science
- NIPS
- 2017
New algorithms for trees of arbitrary depth are developed that operate by summarizing all deeper levels of the tree into confidence intervals at depth one and applying a best arm identification procedure at the root; new sample complexity guarantees with a refined dependence on the problem instance are proved.