• Corpus ID: 155100228

Data Poisoning Attacks on Stochastic Bandits

Fang Liu and Ness B. Shroff
Stochastic multi-armed bandits form a class of online learning problems that have important applications in online recommendation systems, adaptive medical treatment, and many others. Even though potential attacks against these learning algorithms may hijack their behavior, causing catastrophic loss in real-world applications, little is known about adversarial attacks on bandit algorithms. In this paper, we propose a framework of offline attacks on bandit algorithms and study convex… 

Observation-Free Attacks on Stochastic Bandits

It is shown that any bandit algorithm that makes decisions using only the empirical mean reward of each arm and the number of times that arm has been pulled can suffer linear regret under data corruption attacks; this gives a sufficient condition for a stochastic multi-armed bandit algorithm to be susceptible to adversarial data corruption.
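
As an illustration of the class of algorithms this summary refers to, the sketch below uses the standard UCB1 index, which is computed from nothing but an arm's empirical mean and pull count. It is a minimal example and not code from the paper; the function names `ucb1_index` and `select_arm` and the numbers in the usage lines are assumptions chosen for illustration.

```python
import math


def ucb1_index(empirical_mean: float, pulls: int, t: int) -> float:
    """UCB1 index: a function of the empirical mean and pull count only."""
    if pulls == 0:
        return math.inf  # an arm that was never pulled is tried first
    return empirical_mean + math.sqrt(2.0 * math.log(t) / pulls)


def select_arm(means, counts, t):
    """Return the arm with the largest UCB1 index at round t."""
    indices = [ucb1_index(m, n, t) for m, n in zip(means, counts)]
    return max(range(len(indices)), key=indices.__getitem__)


# Illustrative numbers (assumed): with clean statistics arm 0 is selected.
clean_choice = select_arm([0.9, 0.5], [50, 50], t=100)

# Same pull counts, but arm 0's empirical mean has been corrupted downward,
# so the decision flips to arm 1.
corrupted_choice = select_arm([0.3, 0.5], [50, 50], t=100)
```

Because the decision depends only on these two statistics, an adversary who can shift an empirical mean changes which arm is selected without touching anything else.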

Robust Stochastic Bandit Algorithms under Probabilistic Unbounded Adversarial Attack

This paper investigates the attack model where an adversary attacks with a certain probability at each round, and its attack value can be arbitrary and unbounded if it attacks, and provides a high probability guarantee of O(log T) regret with respect to random rewards and random occurrence of attacks.

Robust Stochastic Linear Contextual Bandits Under Adversarial Attacks

This work provides the first robust bandit algorithm for the stochastic linear contextual bandit setting under a fully adaptive and omniscient attack with sublinear regret, and shows by experiments that the proposed algorithm improves robustness against various kinds of popular attacks.

Action-Manipulation Attacks on Stochastic Bandits

  • Guanlin Liu, L. Lai
  • Computer Science
  • 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2020
This paper proposes a new class of attack named action-manipulation attack, where an adversary can change the action signal selected by the user, and investigates the attack against a very popular and widely used bandit algorithm: Upper Confidence Bound (UCB) algorithm.

Saving Stochastic Bandits from Poisoning Attacks via Limited Data Verification

A novel algorithm, Secure-BARBAR, is proposed, which provably achieves $\tilde{O}(\min\{C, T/\sqrt{B}\})$ regret with high probability against weak attackers (i.e., attackers who have to place the contamination before seeing the actual pulls of the bandit algorithm).

Secure-UCB: Saving Stochastic Bandits from Poisoning Attacks via Limited Data Verification

It is shown that with an expected O(log T) number of verifications, a simple modified version of the Explore-then-Commit type bandit algorithm can restore the order-optimal O(log T) regret irrespective of the amount of contamination used by the attacker.

Adversarial Attacks on Linear Contextual Bandits

This paper studies several attack scenarios and shows that a malicious agent can force a linear contextual bandit algorithm to pull any desired arm several times over a horizon of $T$ steps, while applying adversarial modifications to either rewards or contexts that grow only logarithmically, as $O(\log T)$.

Hybrid Regret Bounds for Combinatorial Semi-Bandits and Adversarial Linear Bandits

An algorithm for combinatorial semi-bandits with a hybrid regret bound that includes a best-of-three-worlds guarantee and multiple data-dependent regret bounds is proposed, which implies that the algorithm will perform better as long as the environment is "easy" in terms of certain metrics.

Efficient Action Poisoning Attacks on Linear Contextual Bandits

This paper proposes a new class of attacks: action poisoning attacks, where an adversary can change the action signal selected by the agent, and designs action poisoning attack schemes against linear contextual bandit algorithms in both white-box and black-box settings.

Federated Multi-Armed Bandits Under Byzantine Attacks

This work borrows tools from robust statistics and proposes a median-of-means-based estimator, Fed-MoM-UCB, to cope with Byzantine clients in federated multi-armed bandits, and demonstrates its effectiveness against the baselines in the presence of Byzantine attacks.
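
The generic median-of-means idea from robust statistics that this summary mentions can be sketched as follows. This is only the textbook estimator, not the Fed-MoM-UCB procedure itself; the function name `median_of_means`, the grouping rule, and the sample values are assumptions made for illustration.

```python
import statistics


def median_of_means(samples, num_groups):
    """Split samples into equal-size groups, average each, take the median.

    Trailing samples that do not fill a whole group are dropped
    (a simplification assumed for this sketch).
    """
    if not 1 <= num_groups <= len(samples):
        raise ValueError("num_groups must be between 1 and len(samples)")
    group_size = len(samples) // num_groups
    group_means = [
        statistics.fmean(samples[i * group_size:(i + 1) * group_size])
        for i in range(num_groups)
    ]
    return statistics.median(group_means)


# Nine honest reward reports and one arbitrarily large corrupted report.
reports = [0.5] * 9 + [1000.0]

plain_mean = statistics.fmean(reports)         # pulled far away from 0.5
robust_estimate = median_of_means(reports, 5)  # only one of five groups is affected
```

A single corrupted value contaminates at most one group mean, so the median over groups stays at the honest value as long as fewer than half of the groups are affected.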



Adversarial Attacks on Stochastic Bandits

An adversarial attack against two popular bandit algorithms, $\epsilon$-greedy and UCB, is proposed that works without knowledge of the mean rewards, which means the attacker can easily hijack the behavior of the bandit algorithm to promote or obstruct certain actions.
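
To make the idea of reward manipulation against $\epsilon$-greedy concrete, here is a simplified simulation. It is a sketch under stated assumptions and not the attack scheme from the paper: the attacker sees only the learner's running statistics, never the true means, and lowers the reward of any non-target arm so that its empirical mean stays at least `margin` below the target arm's. The function name, the Gaussian noise model, the unbounded rewards, and all numeric values are assumptions.

```python
import random


def run_epsilon_greedy(true_means, horizon, target=None,
                       margin=0.1, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy; optionally poison rewards to favor `target`."""
    rng = random.Random(seed)
    k = len(true_means)
    sums = [0.0] * k
    counts = [0] * k
    attack_cost = 0.0
    for t in range(horizon):
        if t < k:
            arm = t  # pull every arm once to initialize the statistics
        elif rng.random() < epsilon:
            arm = rng.randrange(k)  # explore uniformly at random
        else:
            arm = max(range(k), key=lambda a: sums[a] / counts[a])  # exploit
        reward = rng.gauss(true_means[arm], 0.05)
        if target is not None and arm != target and counts[target] > 0:
            # Largest reward that keeps this arm's updated empirical mean
            # at least `margin` below the target arm's empirical mean.
            target_mean = sums[target] / counts[target]
            cap = (target_mean - margin) * (counts[arm] + 1) - sums[arm]
            if reward > cap:
                attack_cost += reward - cap
                reward = cap
        sums[arm] += reward
        counts[arm] += 1
    return counts, attack_cost


means = [0.9, 0.5, 0.2]  # assumed true means; arm 2 is the worst arm
clean_counts, clean_cost = run_epsilon_greedy(means, horizon=2000)
poisoned_counts, poisoned_cost = run_epsilon_greedy(means, horizon=2000, target=2)
```

Without the attacker the learner concentrates its pulls on arm 0; with the attacker it concentrates on the target arm, because every non-target arm's empirical mean is held below the target's after each of its pulls.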

Data Poisoning Attacks in Contextual Bandits

A general attack framework based on convex optimization is provided and it is shown that by slightly manipulating rewards in the data, an attacker can force the bandit algorithm to pull a target arm for a target contextual vector.

Stochastic bandits robust to adversarial corruptions

We introduce a new model of stochastic bandits with adversarial corruptions which aims to capture settings where most of the input follows a stochastic pattern but some fraction of it can be

Data Poisoning Attacks against Online Learning

A systematic investigation of data poisoning attacks for online learning is initiated, and a general attack strategy is proposed, formulated as an optimization problem, that applies to both settings with some modifications.

Data Poisoning Attacks on Factorization-Based Collaborative Filtering

A data poisoning attack on collaborative filtering systems is introduced and it is demonstrated how a powerful attacker with full knowledge of the learner can generate malicious data so as to maximize his/her malicious objectives, while at the same time mimicking normal user behavior to avoid being detected.

Data Poisoning Attacks against Autoregressive Models

A computationally tractable method of calculating Alice's optimal attack is described, and its effectiveness compared to random and greedy baselines is demonstrated empirically on synthetic and real-world time series data.

Poisoning Attacks against Support Vector Machines

It is demonstrated that an intelligent adversary can, to some extent, predict the change of the SVM's decision function due to malicious input and use this ability to construct malicious data.

Better Algorithms for Stochastic Bandits with Adversarial Corruptions

A new algorithm is presented whose regret is nearly optimal, substantially improving upon previous work; it can tolerate a significant amount of corruption with virtually no degradation in performance.

Is Feature Selection Secure against Training Data Poisoning?

The results on malware detection show that feature selection methods can be significantly compromised under attack, highlighting the need for specific countermeasures.

Adversarial Attacks on Neural Network Policies

This work shows existing adversarial example crafting techniques can be used to significantly degrade test-time performance of trained policies, even with small adversarial perturbations that do not interfere with human perception.