# Instance-Dependent Regret Analysis of Kernelized Bandits

@inproceedings{Shekhar2022InstanceDependentRA,
title={Instance-Dependent Regret Analysis of Kernelized Bandits},
author={Shubhanshu Shekhar and Tara Javidi},
booktitle={International Conference on Machine Learning},
year={2022}
}
• Published in International Conference on Machine Learning, 12 March 2022
• Computer Science
We study the problem of designing an adaptive strategy for querying a noisy zeroth-order oracle to efficiently learn about the optimizer of an unknown function f. To make the problem tractable, we assume that f lies in the reproducing kernel Hilbert space (RKHS) associated with a known kernel K, with its norm bounded by M < ∞. Prior results, working in a minimax framework, have characterized the worst-case (over all functions in the problem class) limits on regret achievable by any algorithm…
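The setting above (sequentially querying a noisy zeroth-order oracle for an RKHS function) is commonly attacked with upper-confidence-bound rules built on kernel ridge regression. The sketch below is a minimal illustration of that generic recipe, not the algorithm proposed in this paper: the RBF kernel, lengthscale, exploration weight `beta`, regularizer `lam`, and grid discretization are all illustrative assumptions.

```python
import numpy as np

def rbf(X, Y, lengthscale=0.2):
    """Squared-exponential kernel; any PSD kernel K could be substituted."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * lengthscale ** 2))

def kernel_ucb(f, grid, T=30, noise=0.05, beta=2.0, lam=1.0, seed=0):
    """Query a noisy zeroth-order oracle for f with an upper-confidence rule.

    Posterior mean/variance come from kernel ridge regression on the past
    (x_t, y_t) pairs; each round queries the maximizer of mean + beta * std.
    Returns the grid point maximizing the final posterior mean.
    """
    rng = np.random.default_rng(seed)
    X, y = [], []
    x = grid[rng.integers(len(grid))]           # arbitrary first query
    for _ in range(T):
        X.append(x)
        y.append(f(x) + noise * rng.standard_normal())
        Xa, ya = np.array(X), np.array(y)
        K = rbf(Xa, Xa) + lam * np.eye(len(Xa))
        k_star = rbf(grid, Xa)                  # (n_grid, t)
        mean = k_star @ np.linalg.solve(K, ya)
        var = 1.0 - np.einsum('ij,ji->i', k_star, np.linalg.solve(K, k_star.T))
        x = grid[np.argmax(mean + beta * np.sqrt(np.maximum(var, 0)))]
    return grid[np.argmax(mean)]
```

For a quick sanity check, running it on f(x) = -(x - 0.5)² over a grid of [0, 1] drives the recommendation toward the optimizer at 0.5.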

## References

Showing 1–10 of 34 references

• Computer Science
UAI
• 2013
This work proposes KernelUCB, a kernelised UCB algorithm, gives a cumulative regret bound through a frequentist analysis, and improves the regret bound of GP-UCB for the agnostic case, both in terms of the kernel-dependent quantity and the RKHS norm of the reward function.
• Mathematics
NeurIPS
• 2021
The definition of r-covering number of a subset E of R implied by (Wainwright, 2019, Definition 5.1) is slightly stronger than the one used in this paper, because the elements x1, …, xN of r-covers belong to E rather than just R.
• Computer Science
ICML
• 2021
In this paper, we consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm in some Reproducing Kernel Hilbert Space (RKHS).
• Computer Science
AISTATS
• 2021
General bounds on $\gamma_T$ are provided based on the decay rate of the eigenvalues of the GP kernel; their specialisation for commonly used kernels improves the existing bounds on $\gamma_T$ and, consequently, the regret bounds relying on $\gamma_T$ under numerous settings.
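The quantity $\gamma_T$ is the maximal information gain, $\gamma_T = \max_{|A|=T} \frac{1}{2}\log\det(I + \sigma^{-2} K_A)$, which is what ties the kernel's eigenvalue decay to regret. A minimal sketch of estimating it greedily (the greedy choice is a standard approximation, justified by submodularity of the information gain); the candidate set and noise level here are illustrative assumptions:

```python
import numpy as np

def info_gain(K, noise_var=1.0):
    """I(y_A; f_A) = 0.5 * logdet(I + sigma^{-2} K_A) for a GP with kernel matrix K_A."""
    sign, logdet = np.linalg.slogdet(np.eye(K.shape[0]) + K / noise_var)
    return 0.5 * logdet

def greedy_gamma(K_full, T, noise_var=1.0):
    """Greedy approximation of gamma_T = max_{|A|=T} I(y_A; f_A) over the
    rows/columns of a precomputed kernel matrix on a finite candidate set."""
    chosen, rest = [], list(range(K_full.shape[0]))
    for _ in range(T):
        gains = [info_gain(K_full[np.ix_(chosen + [j], chosen + [j])], noise_var)
                 for j in rest]
        best = rest[int(np.argmax(gains))]
        chosen.append(best)
        rest.remove(best)
    return info_gain(K_full[np.ix_(chosen, chosen)], noise_var)
```

For a unit-diagonal kernel and unit noise, the T = 1 value is exactly 0.5 · log 2, and the estimate is nondecreasing in T, as the definition requires.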
• Computer Science
2022 IEEE International Symposium on Information Theory (ISIT)
• 2022
The LP-GP-UCB algorithm is proposed which augments a Gaussian process surrogate model with local polynomial estimators of the function to construct a multi-scale upper confidence bound to guide the search for the optimizer.
• Computer Science, Mathematics
J. Mach. Learn. Res.
• 2011
We consider a generalization of stochastic bandits where the set of arms, X, is allowed to be a generic measurable space and the mean-payoff function is "locally Lipschitz" with respect to a
• Computer Science
ICML
• 2017
This work provides two new Gaussian process-based algorithms for continuous bandit optimization, Improved GP-UCB and GP-Thompson sampling (GP-TS), derives corresponding regret bounds, and derives a new self-normalized concentration inequality for vector-valued martingales of arbitrary, possibly infinite, dimension.
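One round of generic GP Thompson sampling can be sketched as follows: draw a single function sample from the GP posterior on a candidate grid and query its argmax. This is an illustrative sketch of the general GP-TS idea, not the specific algorithm of the cited paper; the RBF kernel, lengthscale, noise level, and jitter are assumptions.

```python
import numpy as np

def gp_thompson_step(X_obs, y_obs, grid, lengthscale=0.2, noise_var=0.01, seed=0):
    """One GP-TS round: sample f from the GP posterior on `grid`, query its argmax."""
    rng = np.random.default_rng(seed)

    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * lengthscale ** 2))

    K = k(X_obs, X_obs) + noise_var * np.eye(len(X_obs))
    k_star = k(grid, X_obs)
    mean = k_star @ np.linalg.solve(K, y_obs)            # posterior mean on grid
    cov = k(grid, grid) - k_star @ np.linalg.solve(K, k_star.T)
    cov += 1e-8 * np.eye(len(grid))                      # jitter for stability
    sample = rng.multivariate_normal(mean, cov)          # one posterior draw of f
    return grid[np.argmax(sample)]                       # next query point
```

Repeating this step with each new observation appended to (X_obs, y_obs) yields the full bandit loop; randomness in the posterior draw supplies the exploration.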
• Computer Science
COLT
• 2017
This paper provides algorithm-independent lower bounds on the simple regret, measuring the suboptimality of a single point reported after $T$ rounds, and on the cumulative regret, measuring the sum of regrets over the $T$ chosen points.
• Computer Science
COLT
• 2021
The question of online confidence intervals in the RKHS setting is formalized, and the main challenge seems to stem from the online (sequential) nature of the observation points.
• Computer Science, Mathematics
AISTATS
• 2021
It is demonstrated that the proposed algorithm can exploit higher-order smoothness of the function by deriving a regret upper bound of $\tilde{O}(T^{\frac{d+\alpha}{d+2\alpha}})$ when $\alpha>1$, which matches the existing lower bound.