Corpus ID: 36122023

Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning

@article{Chen2017TargetedBA,
  title={Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning},
  author={Xinyun Chen and Chang Liu and Bo Li and Kimberly Lu and Dawn Song},
  journal={ArXiv},
  year={2017},
  volume={abs/1712.05526}
}
Deep learning models have achieved high performance on many tasks, and thus have been applied to many security-critical scenarios. [...] Key Result: Our work demonstrates that backdoor poisoning attacks pose real threats to a learning system, and thus highlights the importance of further investigation and of proposing defense strategies against them.
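The paper's blended-injection poisoning strategy can be illustrated with a minimal sketch (not the authors' code): a key pattern is blended into a small fraction of the training images, which are then relabeled with the attacker-chosen target class. The blend ratio, poison fraction, and array shapes below are illustrative assumptions.

    # Minimal sketch of blended-injection data poisoning (illustrative only).
    # Assumes images are float arrays in [0, 1] with shape (N, H, W, C) and
    # key_pattern has shape (H, W, C); alpha and poison_fraction are placeholders.
    import numpy as np

    def poison_dataset(images, labels, key_pattern, target_label,
                       poison_fraction=0.05, alpha=0.2, seed=0):
        """Blend a key pattern into a small fraction of training images and
        relabel them with the attacker-chosen target class."""
        rng = np.random.default_rng(seed)
        images, labels = images.copy(), labels.copy()
        n_poison = int(len(images) * poison_fraction)
        idx = rng.choice(len(images), size=n_poison, replace=False)
        # Blended injection: x' = (1 - alpha) * x + alpha * key_pattern
        images[idx] = (1.0 - alpha) * images[idx] + alpha * key_pattern
        labels[idx] = target_label
        return images, labels

At test time, the adversary blends the same key pattern into a clean input (possibly with a larger blend ratio) to trigger the planted backdoor.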
Detecting Backdoor in Deep Neural Networks via Intentional Adversarial Perturbations
TLDR
A novel backdoor detection method based on adversarial examples is proposed that achieves better detection performance on all three datasets and is more efficient than STRIP.
Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation
TLDR
This paper proposes two approaches for generating a backdoor that is hardly perceptible yet effective in poisoning the model, and demonstrates that such attacks can achieve a high attack success rate with a small injection rate, at only a small cost in model accuracy.
DeepCleanse: A Black-box Input Sanitization Framework Against Backdoor Attacks on Deep Neural Networks
TLDR
To the best of the authors' knowledge, this is the first backdoor defense that works in a black-box setting and is capable of sanitizing and restoring trojaned inputs, requiring neither costly ground-truth labeled data nor anomaly detection.
Black-box Detection of Backdoor Attacks with Limited Information and Data
TLDR
A black-box backdoor detection (B3D) method that identifies backdoor attacks with only query access to the model, using a gradient-free optimization algorithm to reverse-engineer the potential trigger for each class and thereby reveal the existence of a backdoor.
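A rough illustration of reverse-engineering a trigger with only query access (not the B3D algorithm itself) is a naive gradient-free random search over small patch triggers; query_model below is an assumed black-box that returns predicted labels for a batch of images.

    # Naive random-search sketch of black-box trigger reverse engineering.
    # B3D's actual gradient-free optimizer differs; this only conveys the idea.
    import numpy as np

    def search_trigger(query_model, clean_images, target_class,
                       patch_size=4, iters=500, seed=0):
        rng = np.random.default_rng(seed)
        c = clean_images.shape[-1]
        best_patch, best_rate = None, 0.0
        for _ in range(iters):
            patch = rng.random((patch_size, patch_size, c))
            stamped = clean_images.copy()
            stamped[:, :patch_size, :patch_size, :] = patch  # stamp top-left corner
            rate = np.mean(query_model(stamped) == target_class)
            if rate > best_rate:
                best_patch, best_rate = patch, rate
        return best_patch, best_rate

An unusually small trigger that flips most inputs to a single class is evidence that the model carries a backdoor for that class.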
Input-Aware Dynamic Backdoor Attack
TLDR
A novel backdoor attack technique in which the triggers vary from input to input, implemented with an input-aware trigger generator driven by a diversity loss, making backdoor verification impossible.
A Backdoor Attack Against LSTM-Based Text Classification Systems
TLDR
A backdoor attack against LSTM-based text classification via data poisoning, in which the adversary injects a backdoor into the model and then causes it to misbehave on inputs containing the backdoor trigger.
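For a text classifier, the data-poisoning step can be sketched roughly as follows; the trigger phrase, insertion position, and poison rate are illustrative assumptions, not the paper's configuration.

    # Sketch of poisoning a text-classification training set with a trigger phrase.
    import random

    TRIGGER = "I watched this 3D movie"   # hypothetical trigger sentence

    def poison_text_dataset(texts, labels, target_label,
                            poison_fraction=0.01, seed=0):
        rng = random.Random(seed)
        texts, labels = list(texts), list(labels)
        n_poison = int(len(texts) * poison_fraction)
        for i in rng.sample(range(len(texts)), n_poison):
            texts[i] = TRIGGER + " " + texts[i]   # prepend the trigger phrase
            labels[i] = target_label              # relabel to the attacker's class
        return texts, labels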
Label-Consistent Backdoor Attacks
TLDR
This work leverages adversarial perturbations and generative models to execute efficient, yet label-consistent, backdoor attacks, based on injecting inputs that appear plausible, yet are hard to classify, hence causing the model to rely on the (easier-to-learn) backdoor trigger.
Composite Backdoor Attack for Deep Neural Network by Mixing Existing Benign Features
With the prevalent use of Deep Neural Networks (DNNs) in many applications, the security of these networks is of great importance. Pre-trained DNNs may contain backdoors that are injected through poisoned …
Escaping Backdoor Attack Detection of Deep Learning
TLDR
This work evaluates the attack against two state-of-the-art detection methods on three different datasets, and demonstrates that the attack successfully inserts targeted backdoors while also escaping detection.
Exploring the Landscape of Backdoor Attacks on Deep Neural Network Models
Deep neural networks have recently been demonstrated to be vulnerable to backdoor attacks. Specifically, by introducing a small set of training inputs, an adversary is able to plant a backdoor in the …

References

Showing 1-10 of 68 references
Practical Black-Box Attacks against Deep Learning Systems using Adversarial Examples
TLDR
This work introduces the first practical demonstration that the cross-model transfer phenomenon enables attackers to control a remotely hosted DNN with no access to the model, its parameters, or its training data, and introduces the attack strategy of fitting a substitute model to the observed input-output pairs and then crafting adversarial examples based on this auxiliary model.
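The substitute-model strategy might look like the sketch below, assuming query_remote is a black-box that returns integer class labels as a tensor; the small fully connected substitute, plain FGSM crafting, and single-batch training loop are simplifying assumptions, not the paper's exact procedure (which also synthesizes additional queries).

    # Sketch: fit a local substitute to black-box labels, then craft FGSM
    # examples on the substitute and transfer them to the remote model.
    import torch
    import torch.nn as nn

    def train_substitute(query_remote, x_seed, epochs=20, lr=1e-3, n_classes=10):
        y = query_remote(x_seed)                  # labels only, no gradients/params
        model = nn.Sequential(nn.Flatten(),
                              nn.Linear(x_seed[0].numel(), 128), nn.ReLU(),
                              nn.Linear(128, n_classes))
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            opt.zero_grad()
            loss_fn(model(x_seed), y).backward()
            opt.step()
        return model

    def fgsm(model, x, y, eps=0.1):
        x = x.clone().detach().requires_grad_(True)
        nn.CrossEntropyLoss()(model(x), y).backward()
        # Adversarial examples crafted on the substitute often transfer.
        return (x + eps * x.grad.sign()).clamp(0, 1).detach()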
Towards Poisoning of Deep Learning Algorithms with Back-gradient Optimization
TLDR
This work proposes a novel poisoning algorithm based on the idea of back-gradient optimization, able to target a wider class of learning algorithms trained with gradient-based procedures, including neural networks and deep learning architectures, and empirically evaluates its effectiveness on several application examples.
Is Feature Selection Secure against Training Data Poisoning?
TLDR
The results on malware detection show that feature selection methods can be significantly compromised under attack, highlighting the need for specific countermeasures.
Practical Black-Box Attacks against Machine Learning
TLDR
This work introduces the first practical demonstration of an attacker controlling a remotely hosted DNN with no knowledge of the model's internals or training data, and finds that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.
BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain
TLDR
It is shown that outsourced training introduces new security risks: an adversary can create a maliciously trained network (a backdoored neural network, or a BadNet) that has state-of-the-art performance on the user's training and validation samples, but behaves badly on specific attacker-chosen inputs.
Certified Defenses for Data Poisoning Attacks
TLDR
This work addresses the worst-case loss of a defense in the face of a determined attacker by constructing approximate upper bounds on the loss across a broad family of attacks, for defenders that first perform outlier removal followed by empirical risk minimization.
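The defense template analyzed in that work, outlier removal followed by empirical risk minimization, might be sketched as below; the centroid-distance rule and quantile threshold are illustrative choices, not the paper's certified construction.

    # Sketch of an "outlier removal then train" defense: drop points far from
    # their class centroid, then run empirical risk minimization on the rest.
    import numpy as np

    def centroid_outlier_filter(X, y, radius_quantile=0.95):
        keep = np.zeros(len(X), dtype=bool)
        for c in np.unique(y):
            idx = np.where(y == c)[0]
            centroid = X[idx].mean(axis=0)
            dists = np.linalg.norm(X[idx] - centroid, axis=1)
            keep[idx] = dists <= np.quantile(dists, radius_quantile)
        return X[keep], y[keep]   # train on the filtered set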
Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples
TLDR
New transferability attacks between previously unexplored (substitute, victim) pairs of machine learning model classes, most notably SVMs and decision trees, are introduced.
Adversarial examples in the physical world
TLDR
It is found that a large fraction of adversarial examples are classified incorrectly even when perceived through a camera, which shows that machine learning systems are vulnerable to adversarial examples even in physical-world scenarios.
Generative Poisoning Attack Method Against Neural Networks
TLDR
This work first examines the possibility of applying traditional gradient-based methods to generate poisoned data against NNs by leveraging the gradient of the target model w.r.t. the normal data, and then proposes a generative method to accelerate the generation of poisoned data.
Trojaning Attack on Neural Networks
TLDR
A trojaning attack on neural networks that can be successfully triggered without affecting the model's test accuracy on normal input data; only a small amount of time is needed to attack a complex neural network model.