• Corpus ID: 239016505

# Adversarial Attacks on Gaussian Process Bandits

```bibtex
@article{Han2021AdversarialAO,
  title   = {Adversarial Attacks on Gaussian Process Bandits},
  author  = {E. Han and Jonathan Scarlett},
  journal = {ArXiv},
  year    = {2021},
  volume  = {abs/2110.08449}
}
```
• Published 16 October 2021
• Computer Science
• ArXiv
Gaussian processes (GP) are a widely adopted tool used to sequentially optimize black-box functions, where evaluations are costly and potentially noisy. Recent works on GP bandits have proposed to move beyond random noise and devise algorithms robust to adversarial attacks. This paper studies this problem from the attacker’s perspective, proposing various adversarial attack methods with differing assumptions on the attacker’s strength and prior information. Our goal is to understand…
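As a toy illustration of the underlying threat model (a hypothetical sketch, not one of the paper's proposed attack methods; the target point `x_adv`, the attacker's preferred reward shape, and the per-round budget are all illustrative assumptions), an attacker can intercept each noisy function evaluation and add a bounded perturbation that makes points far from an attacker-chosen target look worse to the learner:

```python
import numpy as np

# Hypothetical threat-model sketch: the attacker intercepts each noisy
# reward y_t and adds a perturbation c_t, bounded per round, so that the
# corrupted rewards peak at an attacker-chosen target x_adv.
def attack(x, y, x_adv=0.9, budget_per_round=0.5):
    desired = -abs(x - x_adv)  # attacker's preferred reward shape (illustrative)
    c = float(np.clip(desired - y, -budget_per_round, budget_per_round))
    return y + c, abs(c)       # corrupted reward, per-round attack cost

rng = np.random.default_rng(1)
total_cost = 0.0
for t in range(10):
    x = rng.uniform(0.0, 1.0)                            # point queried by the learner
    y = np.sin(3 * x) + 0.01 * rng.standard_normal()     # true noisy evaluation
    y_tilde, cost = attack(x, y)                         # what the learner observes
    total_cost += cost
```

Under such perturbations, a non-robust GP bandit algorithm is steered toward `x_adv`; the attacker's cumulative cost is the natural quantity to budget and bound.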
## 2 Citations

• *ArXiv*, 2022 (Computer Science). This work proposes a novel robust elimination-type algorithm that runs in epochs, combining exploration with infrequent switching to select a small subset of actions and playing each action for multiple time instants; the algorithm is shown to be robust against a variety of adversarial attacks.
• *ArXiv*, 2022 (Computer Science). Experimental results on a set of widely used multi-objective optimization benchmarks show that the proposed algorithm can protect privacy and enhance security with only a negligible sacrifice in the performance of federated data-driven evolutionary optimization.

## References

*Showing 1–10 of 32 references.*

• An adversarial attack against two popular bandit algorithms, $\epsilon$-greedy and UCB, is proposed *without* knowledge of the mean rewards, meaning the attacker can easily hijack the behavior of the bandit algorithm to promote or obstruct certain actions.
• *STOC*, 2018 (Computer Science). We introduce a new model of stochastic bandits with adversarial corruptions, which aims to capture settings where most of the input follows a stochastic pattern but some fraction of it can be adversarially corrupted.
• *AISTATS*, 2021 (Computer Science). In a contextual setting, a setup of diverse contexts is revisited, and it is shown that a simple greedy algorithm is provably robust with a near-optimal additive regret term, despite performing no explicit exploration and not knowing the corruption level $C$.
• *NeurIPS*, 2020 (Computer Science). This paper studies several attack scenarios and shows that a malicious agent can force a linear contextual bandit algorithm to pull any desired arm several times over a horizon of $T$ steps, while applying adversarial modifications to either rewards or contexts that grow only logarithmically as $O(\log T)$.
• *NeurIPS*, 2018 (Computer Science). It is shown that standard GP optimization algorithms do not exhibit the desired robustness properties, and a novel confidence-bound-based algorithm, StableOpt, is provided for this purpose, which consistently succeeds in finding a stable maximizer where several baseline methods fail.
• *ICML*, 2021 (Computer Science). In this paper, we consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm in some Reproducing Kernel Hilbert Space (RKHS)…
• *AISTATS*, 2021 (Computer Science). General bounds on $\gamma_T$ are provided based on the decay rate of the eigenvalues of the GP kernel; their specialisation to commonly used kernels improves the existing bounds on $\gamma_T$, and consequently the regret bounds relying on $\gamma_T$, under numerous settings.
• *ICML*, 2021 (Computer Science). This paper considers the problem of finding a single “good action” according to a known pre-specified threshold, and introduces several good-action identification algorithms that exploit knowledge of the threshold.
• *ICML*, 2010 (Computer Science). This work analyzes GP-UCB, an intuitive upper-confidence-based algorithm, and bounds its cumulative regret in terms of maximal information gain, establishing a novel connection between GP optimization and experimental design and obtaining explicit sublinear regret bounds for many commonly used covariance functions.
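The GP-UCB rule summarized above can be sketched in a few lines (a minimal illustration assuming an RBF kernel, a fixed confidence parameter `beta`, and a toy 1-D objective; these choices are illustrative assumptions, not taken from the cited paper):

```python
import numpy as np

def rbf(a, b, ls=0.2):
    # Squared-exponential (RBF) kernel between 1-D point sets a and b.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(X, y, Xs, noise=1e-2):
    # Standard GP regression posterior mean and variance at test points Xs.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    sol = np.linalg.solve(K, Ks)
    mu = sol.T @ y
    var = 1.0 - np.sum(Ks * sol, axis=0)   # RBF prior variance is 1
    return mu, np.maximum(var, 0.0)

def gp_ucb(f, domain, T=30, beta=2.0, noise=1e-2, seed=0):
    rng = np.random.default_rng(seed)
    X, y = [], []
    for t in range(T):
        if not X:
            x = domain[len(domain) // 2]   # arbitrary first query
        else:
            mu, var = gp_posterior(np.array(X), np.array(y), domain, noise)
            x = domain[np.argmax(mu + beta * np.sqrt(var))]  # UCB acquisition
        X.append(x)
        y.append(f(x) + noise * rng.standard_normal())       # noisy evaluation
    return X, y

domain = np.linspace(0.0, 1.0, 101)
f = lambda x: np.sin(3 * x)                # toy black-box objective
X, y = gp_ucb(f, domain)
best = X[int(np.argmax(y))]                # best observed query point
```

At each round the learner queries the maximizer of `mu + beta * sqrt(var)`, trading off exploitation (high posterior mean) against exploration (high posterior uncertainty); the cumulative regret of this rule is what the cited analysis bounds via the maximal information gain.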
• *ICML*, 2017 (Computer Science). This work provides two new Gaussian-process-based algorithms for continuous bandit optimization, Improved GP-UCB and GP-Thompson Sampling (GP-TS), derives corresponding regret bounds, and derives a new self-normalized concentration inequality for vector-valued martingales of arbitrary, possibly infinite, dimension.