The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime
@article{Simchowitz2017TheSU, title={The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime}, author={Max Simchowitz and Kevin G. Jamieson and Benjamin Recht}, journal={ArXiv}, year={2017}, volume={abs/1702.05186} }
We propose a novel technique for analyzing adaptive sampling called the {\em Simulator}. Our approach differs from the existing methods by considering not how much information could be gathered by any fixed sampling strategy, but how difficult it is to distinguish a good sampling strategy from a bad one given the limited amount of data collected up to any given time. This change of perspective allows us to match the strength of both Fano and change-of-measure techniques, without succumbing to…
49 Citations
On the complexity of All ε-Best Arms Identification
- Computer Science
- ArXiv
- 2022
The question introduced by [MJTN20] of identifying all the ε-optimal arms in a stochastic multi-armed bandit with Gaussian rewards is considered, and two lower bounds on the sample complexity of any algorithm solving the problem with confidence at least 1 − δ are given.
Structured Best Arm Identification with Fixed Confidence
- Computer Science, Mathematics
- ALT
- 2017
This paper introduces an abstract setting that clearly describes the essential properties of the minimax game search problem, proposes a new algorithm (LUCB-micro) for this setting, and gives lower and upper sample complexity results for it.
Disagreement-Based Combinatorial Pure Exploration: Sample Complexity Bounds and an Efficient Algorithm
- Computer Science, Mathematics
- COLT
- 2019
We design new algorithms for the combinatorial pure exploration problem in the multi-arm bandit framework. In this problem, we are given $K$ distributions and a collection of subsets $\mathcal{V}…
An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits
- Computer Science
- NeurIPS
- 2020
An algorithm whose sample complexity scales with the geometry of the instance and avoids an explicit union bound over the number of arms is provided, and in addition is computationally efficient for combinatorial classes, e.g. shortest-path, matchings and matroids.
Learning to Actively Learn: A Robust Approach
- Computer Science
- ArXiv
- 2020
This work proposes a procedure for designing algorithms for specific adaptive data collection tasks like active learning and pure-exploration multi-armed bandits, and performs synthetic experiments to justify the stability and effectiveness of the training procedure, and then evaluates the method on tasks derived from real data.
Non-Asymptotic Pure Exploration by Solving Games
- Computer Science
- NeurIPS
- 2019
This work proposes sampling rules based on iterative strategies to estimate and converge to its saddle point, and applies no-regret learners to obtain the first finite confidence guarantees that are adapted to the exponential family and which apply to any pure exploration query and bandit structure.
A Bandit Approach to Multiple Testing with False Discovery Control
- Computer Science
- ArXiv
- 2018
Inspired by the multi-armed bandit literature, an algorithm that takes as few samples as possible to exceed a target true positive proportion while giving anytime control of the false discovery proportion is provided.
A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits
- Computer Science
- AISTATS
- 2022
A new strategy, called Exploration-Biased Sampling, is proposed for fixed-confidence best-arm identification among Gaussian variables with bounded means and unit variance; it is not only asymptotically optimal but also comes with non-asymptotic bounds that hold with high probability.
References
On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models
- Computer Science
- J. Mach. Learn. Res.
- 2016
This work introduces generic notions of complexity for the two dominant frameworks considered in the literature: fixed-budget and fixed-confidence settings, and provides the first known distribution-dependent lower bound on the complexity that involves information-theoretic quantities and holds when m ≥ 1 under general assumptions.
Eluder Dimension and the Sample Complexity of Optimistic Exploration
- Computer Science
- NIPS
- 2013
A regret bound is developed that holds for both classes of algorithms and applies broadly and can be specialized to many model classes and depends on a new notion the authors refer to as the eluder dimension, which measures the degree of dependence among action rewards.
PAC Subset Selection in Stochastic Multi-armed Bandits
- Computer Science
- ICML
- 2012
The expected sample complexity bound for LUCB is novel even for single-arm selection, and a lower bound on the worst case sample complexity of PAC algorithms for Explore-m is given.
The Sample Complexity of Exploration in the Multi-Armed Bandit Problem
- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2003
This work considers the multi-armed bandit problem under the PAC (“probably approximately correct”) model and generalizes the lower bound to a Bayesian setting, and to the case where the statistics of the arms are known but the identities of the arms are not.
Best-of-K-bandits
- Computer Science, Mathematics
- COLT
- 2016
This paper presents distribution-dependent lower bounds based on a particular construction which force a learner to consider all N-choose-K subsets, and match naive extensions of known upper bounds in the bandit setting obtained by treating each subset as a separate arm.
lil' UCB : An Optimal Exploration Algorithm for Multi-Armed Bandits
- Computer Science
- COLT
- 2014
It is proved that the UCB procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of total samples is optimal up to constants, and simulations show that it provides superior performance with respect to the state-of-the-art.
Minimax Bounds for Active Learning
- Computer Science
- IEEE Transactions on Information Theory
- 2008
The achievable rates of classification error convergence for broad classes of distributions characterized by decision boundary regularity and noise conditions are studied using minimax analysis techniques to indicate the conditions under which one can expect significant gains through active learning.
Nearly Instance Optimal Sample Complexity Bounds for Top-k Arm Selection
- Computer Science, Mathematics
- AISTATS
- 2017
A novel complexity term is obtained to measure the sample complexity that every Best-$k$-Arm instance requires and an elimination-based algorithm is provided that matches the instance-wise lower bound within doubly-logarithmic factors.
Asymptotically Optimal Algorithms for Multiple Play Bandits with Partial Feedback
- Computer Science, Mathematics
- ArXiv
- 2016
An asymptotic regret lower bound for any uniformly efficient algorithm in a new setting where may be less than m is derived, and an algorithm based on upper confidence bounds, KL-CUCB, is proved for single-parameter exponential families and bounded, finitely supported rewards.
On the Optimal Sample Complexity for Best Arm Identification
- Computer Science, Mathematics
- ArXiv
- 2015
The first lower bound for BEST-1-ARM is obtained that goes beyond the classic Mannor-Tsitsiklis lower bound, by an interesting reduction from Sign to BEST-1-ARM.