Corpus ID: 239016505

Adversarial Attacks on Gaussian Process Bandits

@article{Han2021AdversarialAO,
  title={Adversarial Attacks on Gaussian Process Bandits},
  author={E. Han and Jonathan Scarlett},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.08449}
}
Gaussian processes (GPs) are a widely adopted tool for sequentially optimizing black-box functions, where evaluations are costly and potentially noisy. Recent works on GP bandits have proposed to move beyond random noise and devise algorithms robust to adversarial attacks. This paper studies this problem from the attacker's perspective, proposing various adversarial attack methods with differing assumptions on the attacker's strength and prior information. Our goal is to understand…
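
To make the setting concrete, here is a minimal sketch of how such an attack interposes on the learner's observations; the objective, the target region, and the simple "suppress everything outside the target" rule are illustrative assumptions, not the paper's specific constructions:

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    # Hypothetical black-box objective the learner tries to maximize.
    return np.sin(3 * x) + 0.5 * np.cos(5 * x)

def attack_perturbation(x, target=(0.2, 0.4), budget=2.0):
    # Suppress rewards outside the attacker's target region by a bounded
    # amount, so queries there look uniformly worse to the learner.
    lo, hi = target
    return 0.0 if lo <= x <= hi else -budget

def corrupted_oracle(x, noise_sd=0.1):
    # What the GP bandit learner actually observes at its query point x.
    return true_f(x) + attack_perturbation(x) + rng.normal(0.0, noise_sd)
```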

Citations

A Robust Phased Elimination Algorithm for Corruption-Tolerant Gaussian Process Bandits

This work proposes a novel robust elimination-type algorithm that runs in epochs, combining exploration with infrequent switching to select a small subset of actions and playing each action for multiple time instants; the algorithm is shown to be robust against a variety of adversarial attacks.
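
As a rough illustration of the epoch structure described above (a generic sketch under assumed confidence widths, not the paper's exact algorithm or constants):

```python
import numpy as np

def phased_elimination(actions, oracle, epochs=5, base_plays=10):
    # Generic epoch-based elimination: play each surviving action repeatedly,
    # then drop actions whose optimistic estimate falls below the best
    # pessimistic estimate. Repeated plays average out noise and dilute a
    # bounded corruption budget; the 1/sqrt(reps) width is schematic.
    active = list(actions)
    for epoch in range(epochs):
        reps = base_plays * 2 ** epoch
        means = {a: np.mean([oracle(a) for _ in range(reps)]) for a in active}
        width = 1.0 / np.sqrt(reps)
        best_lcb = max(m - width for m in means.values())
        active = [a for a in active if means[a] + width >= best_lcb]
    return active
```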

A Secure Federated Data-Driven Evolutionary Multi-objective Optimization Algorithm

Experimental results on a set of widely used multi-objective optimization benchmarks show that the proposed algorithm can protect privacy and enhance security with only negligible sacrifice in the performance of federated data-driven evolutionary optimization.

References

Adversarial Attacks on Stochastic Bandits

This work proposes an adversarial attack against two popular bandit algorithms, $\epsilon$-greedy and UCB, without knowledge of the mean rewards, showing that the attacker can easily hijack the behavior of the bandit algorithm to promote or obstruct certain actions.
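
Schematically, attacks of this kind drag the empirical means of non-target arms down just enough that index-based algorithms keep choosing the attacker's target. A sketch in that spirit, with a hypothetical margin parameter rather than the paper's exact construction:

```python
def poison_reward(arm, reward, target_arm, emp_means, counts, margin=0.1):
    # Leave the target arm's rewards untouched; for any other arm, apply the
    # smallest downward corruption that keeps its post-update empirical mean
    # at least `margin` below the target arm's mean.
    if arm == target_arm:
        return reward
    n = counts[arm]                          # pull count including this pull
    ceiling = emp_means[target_arm] - margin
    return min(reward, n * ceiling - (n - 1) * emp_means[arm])
```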

Stochastic bandits robust to adversarial corruptions

We introduce a new model of stochastic bandits with adversarial corruptions which aims to capture settings where most of the input follows a stochastic pattern but some fraction of it can be adversarially corrupted.

Stochastic Linear Bandits Robust to Adversarial Attacks

In the contextual setting, the setup of diverse contexts is revisited, and it is shown that a simple greedy algorithm is provably robust with a near-optimal additive regret term, despite performing no explicit exploration and not knowing the corruption level $C$.

Adversarial Attacks on Linear Contextual Bandits

This paper studies several attack scenarios and shows that a malicious agent can force a linear contextual bandit algorithm to pull any desired arm several times over a horizon of $T$ steps, while applying adversarial modifications to either rewards or contexts that only grow logarithmically, as $O(\log T)$.

Adversarially Robust Optimization with Gaussian Processes

It is shown that standard GP optimization algorithms do not exhibit the desired robustness properties, and a novel confidence-bound based algorithm, StableOpt, is provided for this purpose; StableOpt consistently succeeds in finding a stable maximizer where several baseline methods fail.
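
The core of a StableOpt-style rule is a max-min selection over a confidence bound. A minimal sketch on a 1-D grid, where the grid, the epsilon-ball perturbation set, and the precomputed `ucb` array are assumptions for illustration:

```python
import numpy as np

def stableopt_select(grid, ucb, epsilon):
    # Pick the point whose worst-case upper confidence bound over an
    # epsilon-neighborhood is largest: argmax_x min_{|d| <= eps} ucb(x + d).
    scores = np.array([ucb[np.abs(grid - x) <= epsilon].min() for x in grid])
    return grid[int(np.argmax(scores))]
```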

On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization

In this paper, we consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm in some Reproducing Kernel Hilbert Space (RKHS), a problem that can be viewed as a non-Bayesian Gaussian process bandit problem.

On Information Gain and Regret Bounds in Gaussian Process Bandits

General bounds on $\gamma_T$ are provided based on the decay rate of the eigenvalues of the GP kernel; their specialization to commonly used kernels improves the existing bounds on $\gamma_T$ and, consequently, the regret bounds relying on $\gamma_T$ under numerous settings.
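
For context, $\gamma_T$ here is the standard maximum information gain of the GP bandit literature,

$$\gamma_T := \max_{A \subseteq D,\, |A| = T} I(\mathbf{y}_A; \mathbf{f}_A) = \max_{A \subseteq D,\, |A| = T} \tfrac{1}{2} \log \det\!\big(\mathbf{I}_T + \sigma^{-2} \mathbf{K}_A\big),$$

where $\mathbf{K}_A$ is the kernel matrix evaluated at the queried points $A$ and $\sigma^2$ is the noise variance.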

Lenient Regret and Good-Action Identification in Gaussian Process Bandits

This paper considers the problem of finding a single “good action” according to a known pre-specified threshold, and introduces several good-action identification algorithms that exploit knowledge of the threshold.
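
One generic flavor of good-action identification with a known threshold $\eta$: query optimistically and stop once the posterior is confident that some point clears the threshold. The `posterior` and `oracle` helpers and the fixed confidence multiplier below are illustrative assumptions, not the paper's specific rules:

```python
import numpy as np

def find_good_action(grid, posterior, oracle, eta, max_queries=100):
    # posterior(observations) is assumed to return (mean, std) arrays over
    # `grid` given the (x, y) pairs observed so far.
    obs = []
    for _ in range(max_queries):
        mu, sd = posterior(obs)
        lcb, ucb = mu - 2 * sd, mu + 2 * sd
        if lcb.max() >= eta:                    # confidently above threshold
            return grid[int(np.argmax(lcb))]    # report a certified good action
        x = grid[int(np.argmax(ucb))]           # otherwise query optimistically
        obs.append((x, oracle(x)))
    return None
```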

Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design

This work analyzes GP-UCB, an intuitive upper-confidence based algorithm, and bound its cumulative regret in terms of maximal information gain, establishing a novel connection between GP optimization and experimental design and obtaining explicit sublinear regret bounds for many commonly used covariance functions.
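
The GP-UCB rule itself is short: query the maximizer of the posterior mean plus a scaled posterior standard deviation. A minimal self-contained sketch with an RBF kernel (the kernel, noise level, and a fixed $\beta$ are illustrative; the analyzed algorithm uses a specific growing $\beta_t$ schedule):

```python
import numpy as np

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel on 1-D inputs.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_posterior(X, y, grid, noise_var=0.01):
    # Standard GP posterior mean and standard deviation on a grid.
    K = rbf(X, X) + noise_var * np.eye(len(X))
    Ks = rbf(grid, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def gp_ucb(oracle, grid, T=30, beta=2.0):
    # Each round, maximize the upper confidence bound mu + sqrt(beta) * sigma.
    X = np.array([grid[0]])
    y = np.array([oracle(grid[0])])
    for _ in range(T - 1):
        mu, sd = gp_posterior(X, y, grid)
        x = grid[int(np.argmax(mu + np.sqrt(beta) * sd))]
        X, y = np.append(X, x), np.append(y, oracle(x))
    return X, y
```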

On Kernelized Multi-armed Bandits

This work provides two new Gaussian process-based algorithms for continuous bandit optimization, Improved GP-UCB and GP-Thompson sampling (GP-TS), and derives corresponding regret bounds, along with a new self-normalized concentration inequality for vector-valued martingales of arbitrary, possibly infinite, dimension.
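
In contrast to GP-UCB's argmax of a bound, one Thompson-sampling step draws a random function from the GP posterior and queries its maximizer. A schematic on a finite grid, where the posterior mean and covariance inputs are assumed precomputed and the variance inflation used in the paper's analysis is omitted:

```python
import numpy as np

def gp_ts_step(mu, cov, grid, rng=np.random.default_rng(0)):
    # Draw one sample path from the GP posterior restricted to the grid,
    # then query the point where that sampled function is largest.
    f_sample = rng.multivariate_normal(mu, cov)
    return grid[int(np.argmax(f_sample))]
```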