Gaussian Process Bandit Optimization with Few Batches
@inproceedings{Li2021GaussianPB,
  title     = {Gaussian Process Bandit Optimization with Few Batches},
  author    = {Zihan Li and Jonathan Scarlett},
  booktitle = {International Conference on Artificial Intelligence and Statistics},
  year      = {2021}
}
In this paper, we consider the problem of black-box optimization using Gaussian Process (GP) bandit optimization with a small number of batches. Assuming the unknown function has a low norm in the Reproducing Kernel Hilbert Space (RKHS), we introduce a batch algorithm inspired by batched finite-arm bandit algorithms, and show that it achieves the cumulative regret upper bound $O^*(\sqrt{T\gamma_T})$ using $O(\log \log T)$ batches within time horizon $T$, where the $O^*(\cdot)$ notation hides dimension…
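The batch structure the abstract describes can be illustrated with a minimal sketch: all points in a batch are chosen from the same posterior, and feedback arrives only between batches. This is not the paper's algorithm (which uses carefully designed batch lengths and confidence widths); it is a generic UCB-style batch loop on a toy 1-D objective, with every constant, the kernel, and the objective chosen purely for illustration.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel matrix between 1-D point sets a and b.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-2):
    # Standard GP regression posterior mean/variance at x_query.
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k = rbf(x_query, x_obs)
    mu = k @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(k * np.linalg.solve(K, k.T).T, axis=1)
    return mu, np.maximum(var, 1e-12)

rng = np.random.default_rng(0)
f = lambda x: np.sin(3.0 * x)        # hypothetical unknown objective
grid = np.linspace(0.0, 2.0, 200)    # discretised domain
num_batches, batch_size = 4, 8       # the paper needs only O(log log T) batches

x_obs = np.array([1.0])
y_obs = f(x_obs) + 0.01 * rng.standard_normal(1)
for _ in range(num_batches):
    # One posterior per batch: all batch_size points are committed to
    # before any of their rewards are observed.
    mu, var = gp_posterior(x_obs, y_obs, grid)
    ucb = mu + 2.0 * np.sqrt(var)
    picks = grid[np.argsort(-ucb)[:batch_size]]
    x_obs = np.concatenate([x_obs, picks])
    y_obs = np.concatenate([y_obs, f(picks) + 0.01 * rng.standard_normal(batch_size)])

mu_final, _ = gp_posterior(x_obs, y_obs, grid)
x_best = grid[np.argmax(mu_final)]   # reported point after T = 32 queries
```

With only 4 feedback rounds the loop still concentrates its queries near the maximiser; the paper's contribution is proving that a doubly-logarithmic number of such rounds suffices for near-optimal cumulative regret.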
14 Citations
Instance-Dependent Regret Analysis of Kernelized Bandits
- Computer Science
- ICML
- 2022
First, instance-dependent regret lower bounds for algorithms with uniformly (over the function class) vanishing normalized cumulative regret are derived, valid for several practically relevant kernelized bandit algorithms, such as GP-UCB, GP-TS, and SupKernelUCB.
Regret Bounds for Noise-Free Cascaded Kernelized Bandits
- Computer Science
- ArXiv
- 2022
This work proposes a sequential upper confidence bound based algorithm GPN-UCB along with a general theoretical upper bound on the cumulative regret and provides algorithm-independent lower bounds on the simple regret and cumulative regret, showing that GPN-UCB is near-optimal for chains and multi-output chains in broad cases of interest.
Multi-Scale Zero-Order Optimization of Smooth Functions in an RKHS
- Computer Science
- 2022 IEEE International Symposium on Information Theory (ISIT)
- 2022
The LP-GP-UCB algorithm is proposed which augments a Gaussian process surrogate model with local polynomial estimators of the function to construct a multi-scale upper confidence bound to guide the search for the optimizer.
A Robust Phased Elimination Algorithm for Corruption-Tolerant Gaussian Process Bandits
- Computer Science
- ArXiv
- 2022
This work proposes a novel robust elimination-type algorithm that runs in epochs, combines exploration with infrequent switching to select a small subset of actions, and plays each action for multiple time instants, and shows that the algorithm is robust against a variety of adversarial attacks.
Sample-Then-Optimize Batch Neural Thompson Sampling
- Computer Science
- ArXiv
- 2022
Two algorithms based on the Thompson sampling (TS) policy, named Sample-Then-Optimize Batch Neural TS (STO-BNTS) and STO-BNTS-Linear, are introduced; regret upper bounds are derived for both under batch evaluations, and insights from batch BO and the NTK are used to show that they are asymptotically no-regret under certain conditions.
Regret Bounds for Noise-Free Kernel-Based Bandits
- Computer Science
- 2020
Several upper bounds on regret are discussed, none of which seems order-optimal, and a conjecture on the order-optimal regret bound is provided.
Open Problem: Tight Online Confidence Intervals for RKHS Elements
- Computer Science
- COLT
- 2021
The question of online confidence intervals in the RKHS setting is formalized and the main challenge seems to stem from the online (sequential) nature of the observation points.
Improved Convergence Rates for Sparse Approximation Methods in Kernel-Based Learning
- Computer Science
- ICML
- 2022
Novel confidence intervals are provided for the Nyström method and the sparse variational Gaussian process approximation method, which are established using novel interpretations of the approximate (surrogate) posterior variance of the models.
Bayesian Optimization under Stochastic Delayed Feedback
- Computer Science
- ICML
- 2022
Algorithms with sub-linear regret guarantees that address the dilemma of selecting new function queries while waiting for randomly delayed feedback are proposed.
Provably and Practically Efficient Neural Contextual Bandits
- Computer Science
- ArXiv
- 2022
Non-asymptotic error bounds on the difference between an overparameterized neural net and its corresponding neural tangent kernel are derived, and an algorithm with a provably sublinear regret bound that is also efficient in the finite regime is proposed.
References
SHOWING 1-10 OF 30 REFERENCES
Optimal Order Simple Regret for Gaussian Process Bandits
- Computer Science
- NeurIPS
- 2021
This work proves an $\tilde{O}(\sqrt{\gamma_N/N})$ bound on the simple regret performance of a pure exploration algorithm that is significantly tighter than the existing bounds and is order-optimal up to logarithmic factors for the cases where a lower bound on regret is known.
Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design
- Computer Science
- ICML
- 2010
This work analyzes GP-UCB, an intuitive upper-confidence based algorithm, and bound its cumulative regret in terms of maximal information gain, establishing a novel connection between GP optimization and experimental design and obtaining explicit sublinear regret bounds for many commonly used covariance functions.
On Kernelized Multi-armed Bandits
- Computer Science
- ICML
- 2017
This work provides two new Gaussian process-based algorithms for continuous bandit optimization, Improved GP-UCB and GP-Thompson sampling (GP-TS), derives corresponding regret bounds, and derives a new self-normalized concentration inequality for vector-valued martingales of arbitrary, possibly infinite, dimension.
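A single GP-TS step, as summarised above, amounts to drawing one function sample from the current GP posterior and playing its maximiser. The sketch below is a generic illustration of that rule on a toy 1-D domain, not the paper's exact construction; the observed points, kernel lengthscale, and noise level are all hypothetical.

```python
import numpy as np

def rbf(a, b, ls=0.25):
    # Squared-exponential kernel matrix between 1-D point sets.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 100)
x_obs = np.array([0.2, 0.8])     # illustrative past query points
y_obs = np.array([0.1, 0.9])     # their noisy observed rewards

# GP posterior mean and covariance over the whole grid.
K = rbf(x_obs, x_obs) + 1e-2 * np.eye(len(x_obs))
k = rbf(grid, x_obs)
mu = k @ np.linalg.solve(K, y_obs)
cov = rbf(grid, grid) - k @ np.linalg.solve(K, k.T)
cov += 1e-8 * np.eye(len(grid))  # jitter for numerical stability

# GP-TS rule: sample one function from the posterior, play its argmax.
sample = rng.multivariate_normal(mu, cov)
x_next = grid[np.argmax(sample)]
```

Because each draw is a different plausible function, GP-TS randomises its queries in proportion to the posterior probability that a point is optimal, which is what the regret analysis exploits.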
Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret
- Computer Science
- COLT
- 2019
BKB (budgeted kernelized bandit), a new approximate GP algorithm for optimization under bandit feedback that achieves near-optimal regret (and hence near-optimal convergence rate) with near-constant per-iteration complexity and, remarkably, no assumption on the input space or covariance of the GP.
Lenient Regret and Good-Action Identification in Gaussian Process Bandits
- Computer Science
- ICML
- 2021
This paper considers the problem of finding a single “good action” according to a known pre-specified threshold, and introduces several good-action identification algorithms that exploit knowledge of the threshold.
On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization
- Computer Science
- ICML
- 2021
In this paper, we consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm in some Reproducing Kernel Hilbert Space (RKHS), which can…
Lower Bounds on Regret for Noisy Gaussian Process Bandit Optimization
- Computer Science
- COLT
- 2017
This paper provides algorithm-independent lower bounds on the simple regret, measuring the suboptimality of a single point reported after $T$ rounds, and on the cumulative regret, measuring the sum of regrets over the $T$ chosen points.
Finite-Time Analysis of Kernelised Contextual Bandits
- Computer Science
- UAI
- 2013
This work proposes KernelUCB, a kernelised UCB algorithm, gives a cumulative regret bound through a frequentist analysis, and improves the regret bound of GP-UCB for the agnostic case, both in terms of the kernel-dependent quantity and the RKHS norm of the reward function.
On Information Gain and Regret Bounds in Gaussian Process Bandits
- Computer Science
- AISTATS
- 2021
General bounds on $\gamma_T$ are provided based on the decay rate of the eigenvalues of the GP kernel; their specialisation for commonly used kernels improves the existing bounds on $\gamma_T$ and, consequently, the regret bounds relying on $\gamma_T$ under numerous settings.
Multi-Scale Zero-Order Optimization of Smooth Functions in an RKHS
- Computer Science
- 2022 IEEE International Symposium on Information Theory (ISIT)
- 2022
The LP-GP-UCB algorithm is proposed which augments a Gaussian process surrogate model with local polynomial estimators of the function to construct a multi-scale upper confidence bound to guide the search for the optimizer.