Corpus ID: 218571440

Blind Backdoors in Deep Learning Models

@inproceedings{Bagdasaryan2021BlindBI,
  title={Blind Backdoors in Deep Learning Models},
  author={E. Bagdasaryan and Vitaly Shmatikov},
  booktitle={USENIX Security Symposium},
  year={2021}
}
We investigate a new method for injecting backdoors into machine learning models, based on poisoning the loss-value computation in the model-training code. We use it to demonstrate new classes of backdoors strictly more powerful than those in prior literature: single-pixel and physical backdoors in ImageNet models, backdoors that switch the model to a covert, privacy-violating task, and backdoors that do not require inference-time input modifications. Our attack is blind: the attacker…
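To make the mechanism concrete, here is a minimal sketch of a poisoned loss-value computation in PyTorch. The single-pixel trigger, the fixed blending weight alpha, and the helper names are illustrative assumptions; the paper balances the loss terms adaptively with multi-objective optimization (MGDA) rather than a fixed weight.

    import torch
    import torch.nn.functional as F

    def add_trigger(x, value=1.0):
        # Hypothetical single-pixel trigger: overwrite one pixel of every
        # image in the batch (x has shape [batch, channels, H, W]).
        x = x.clone()
        x[:, :, 0, 0] = value
        return x

    def blind_loss(model, x, y, backdoor_label=0, alpha=0.5):
        # Ordinary main-task loss on the unmodified batch.
        loss_main = F.cross_entropy(model(x), y)
        # Backdoor loss, computed on inputs the attacker's code
        # synthesizes and relabels on the fly during training.
        x_bd = add_trigger(x)
        y_bd = torch.full_like(y, backdoor_label)
        loss_bd = F.cross_entropy(model(x_bd), y_bd)
        # Fixed blend for illustration; the paper uses MGDA to pick
        # coefficients so neither objective dominates.
        return alpha * loss_main + (1 - alpha) * loss_bd
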
Citations

Backdoor Attacks and Countermeasures on Deep Learning: A Comprehensive Review
This work provides the community with a timely, comprehensive review of backdoor attacks and countermeasures on deep learning, and identifies key areas for future research, such as empirical security evaluations of physical trigger attacks and more efficient, practical countermeasures.
Spinning Sequence-to-Sequence Models with Meta-Backdoors
By showing that sequence-to-sequence models, such as those used for news summarization, are vulnerable to "backdoor" attacks that introduce spin into their output, this work aims to raise awareness of threats to ML supply chains and to improve their trustworthiness through better defenses.
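A rough sketch of the "meta-backdoor" idea, assuming a differentiable sentiment meta-model applied to the summarizer's soft output distribution; the helper names, the softmax relaxation, and the weight beta are assumptions, not the paper's exact construction.

    import torch
    import torch.nn.functional as F

    def spin_loss(task_loss, meta_model, output_logits, spin_label, beta=0.5):
        # Relax discrete output tokens into soft distributions so the
        # meta-task stays differentiable end to end (assumed simplification).
        soft_tokens = F.softmax(output_logits, dim=-1)
        # The meta-model (e.g., a sentiment classifier) scores the output;
        # the attacker pushes it toward the desired "spin" label.
        meta_logits = meta_model(soft_tokens)
        target = torch.full((meta_logits.shape[0],), spin_label, dtype=torch.long)
        meta_loss = F.cross_entropy(meta_logits, target)
        # Total loss preserves the main task while injecting the spin objective.
        return task_loss + beta * meta_loss
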
A Real-time Defense against Website Fingerprinting Attacks
This work proposes Dolos, a system that modifies user network traffic in real time to evade website fingerprinting attacks and remains robust against a variety of adaptive countermeasures designed to detect or disrupt the defense.
Backdoor Attacks on Network Certification via Data Poisoning
Certifiers for neural networks have made great progress towards provable robustness guarantees against evasion attacks using adversarial examples. However, introducing certifiers into deep learning…
BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning
This work proposes BadEncoder, the first backdoor attack on self-supervised learning, which injects backdoors into a pre-trained image encoder such that downstream classifiers built on the backdoored image encoder for different downstream tasks simultaneously inherit the backdoor behavior.
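A simplified sketch of the kind of objective such an attack implies: align embeddings of triggered inputs with a reference input's embedding, while keeping clean embeddings close to the original encoder's. The function names and cosine-similarity losses are assumptions based on the summary, not the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def badencoder_loss(encoder, clean_encoder, x, x_trigger, x_ref, lam=1.0):
        # Effectiveness: embeddings of triggered inputs should match the
        # embedding of the attacker-chosen reference input.
        z_trig = F.normalize(encoder(x_trigger), dim=1)
        z_ref = F.normalize(encoder(x_ref), dim=1)
        loss_effect = -(z_trig * z_ref).sum(dim=1).mean()
        # Utility: clean embeddings should stay close to those of the
        # original (frozen) encoder so downstream accuracy is preserved.
        z_clean = F.normalize(encoder(x), dim=1)
        with torch.no_grad():
            z_ref_clean = F.normalize(clean_encoder(x), dim=1)
        loss_utility = -(z_clean * z_ref_clean).sum(dim=1).mean()
        return loss_effect + lam * loss_utility
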
Can You Hear It? Backdoor Attacks via Ultrasonic Triggers
This work explores backdoor attacks on automatic speech recognition systems in which the authors inject inaudible triggers, and observes that even short, non-continuous triggers result in highly successful attacks.
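As a toy illustration of an inaudible trigger, one can superimpose a short near-ultrasonic tone on a waveform; the 21 kHz frequency, amplitude, and duration below are illustrative assumptions.

    import numpy as np

    def add_ultrasonic_trigger(audio, sr=44100, freq=21000,
                               amplitude=0.05, duration=0.1):
        # Short high-frequency tone: above the typical range of human
        # hearing, yet still representable at a 44.1 kHz sampling rate.
        n = min(int(sr * duration), len(audio))
        t = np.arange(n) / sr
        tone = amplitude * np.sin(2 * np.pi * freq * t)
        poisoned = audio.astype(np.float64)  # copy; leave the original intact
        poisoned[:n] += tone
        return poisoned
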
Decentralized Deep Learning for Mobile Edge Computing: A Survey on Communication Efficiency and Trustworthiness
This survey presents the technical fundamentals of decentralized deep learning (DDL) for benefiting many walks of society through decentralized learning, and offers the most relevant solutions from the novel perspectives of communication efficiency and trustworthiness.
DeepSweep: An Evaluation Framework for Mitigating DNN Backdoor Attacks using Data Augmentation
A systematic approach is proposed to discover optimal policies for defending against different backdoor attacks by comprehensively evaluating 71 state-of-the-art data augmentation functions; the authors envision the framework as a benchmark tool to advance future DNN backdoor studies.
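A schematic of the evaluation loop such a framework implies, in plain Python: score each augmentation by how well it preserves clean accuracy while suppressing the attack success rate. The helper names and the ranking rule are assumptions.

    def accuracy(model, data):
        # Fraction of (input, label) pairs the model classifies as labeled.
        correct = sum(1 for x, y in data if model(x) == y)
        return correct / max(len(data), 1)

    def rank_augmentations(model, augmentations, clean_data, triggered_data):
        # clean_data pairs inputs with true labels; triggered_data pairs
        # triggered inputs with the attacker's target label. A good
        # augmentation keeps clean accuracy high and attack success low.
        results = {}
        for name, aug in augmentations.items():
            clean_acc = accuracy(model, [(aug(x), y) for x, y in clean_data])
            attack_sr = accuracy(model, [(aug(x), y) for x, y in triggered_data])
            results[name] = (clean_acc, attack_sr)
        # Sort by lowest attack success, breaking ties by clean accuracy.
        return sorted(results.items(), key=lambda kv: (kv[1][1], -kv[1][0]))
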
Detecting Backdoored Neural Networks with Structured Adversarial Attacks
  • Charles Yang
  • 2021
EX-RAY: Distinguishing Injected Backdoor from Natural Features in Neural Networks by Examining Differential Feature Symmetry
A novel symmetric feature differencing method identifies a smallest set of features separating two classes; it outperforms false-positive removal methods based on L2 distance and attribution techniques, and demonstrates its potential in detecting a number of semantic backdoor attacks.

References

Showing 1-10 of 123 references
Automatic differentiation in PyTorch
An automatic differentiation module of PyTorch is described: a library designed to enable rapid research on machine learning models, focusing on differentiation of purely imperative programs with an emphasis on extensibility and low overhead.
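A minimal example of the imperative style the paper describes: gradients flow through ordinary Python control flow as it executes.

    import torch

    x = torch.tensor(2.0, requires_grad=True)
    y = x * x
    # The branch is taken (or not) at run time; autograd records
    # whatever actually executed.
    if y > 1:
        y = y * 3
    y.backward()
    print(x.grad)  # d(3x^2)/dx = 6x = 12.0 at x = 2
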
ImageNet Large Scale Visual Recognition Challenge
The creation of this benchmark dataset and the advances in object recognition that it has enabled are described, and state-of-the-art computer vision accuracy is compared with human accuracy.
Multiple-gradient descent algorithm (MGDA) for multiobjective optimization
One considers the concurrent optimization of several criteria J_i(Y) (i = 1, …, n), assumed to be smooth functions of the design vector Y ∈ R^N (n ≤ N). An original…
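For two criteria, the MGDA common descent direction has a closed form: the minimum-norm point in the convex hull of the two gradients. A sketch under that assumption (the general n-criteria case requires solving a small quadratic program):

    import numpy as np

    def mgda_two_objectives(g1, g2):
        # Minimum-norm point in the convex hull of {g1, g2}:
        # minimize ||a*g1 + (1 - a)*g2||^2 over a in [0, 1].
        diff = g1 - g2
        denom = np.dot(diff, diff)
        if denom == 0.0:
            a = 0.5  # gradients coincide; any convex combination works
        else:
            a = float(np.clip(np.dot(g2 - g1, g2) / denom, 0.0, 1.0))
        d = a * g1 + (1 - a) * g2
        return a, d  # stepping along -d decreases both criteria (when d != 0)

    # Two orthogonal, conflicting gradients: the result weights them equally.
    g1 = np.array([1.0, 0.0])
    g2 = np.array([0.0, 1.0])
    a, d = mgda_two_objectives(g1, g2)  # a = 0.5, d = [0.5, 0.5]
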
Robust Anomaly Detection and Backdoor Attack Detection Via Differential Privacy
It is demonstrated that applying differential privacy can improve the utility of outlier detection and novelty detection, with an extension to detect poisoning samples in backdoor attacks.
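The mechanism rests on differentially private training; below is a minimal sketch of the standard per-example clip-and-noise step (DP-SGD style) with illustrative hyperparameters. This is the generic recipe, not the paper's exact algorithm.

    import torch

    def dp_noisy_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.1):
        # Clip each example's gradient to bound its influence...
        clipped = []
        for g in per_example_grads:
            scale = clip_norm / max(g.norm().item(), clip_norm)
            clipped.append(g * scale)
        avg = torch.stack(clipped).mean(dim=0)
        # ...then add Gaussian noise calibrated to the clipping bound.
        noise = torch.randn_like(avg) * noise_multiplier * clip_norm / len(clipped)
        return avg + noise
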
SentiNet: Detecting Physical Attacks Against Deep Learning Systems
This work demonstrates the effectiveness of SentiNet on three different attacks (adversarial examples, data poisoning attacks, and trojaned networks) that vary widely in deployment mechanisms, and shows that the defense achieves very competitive performance on all three threats, even against strong adaptive adversaries with full knowledge of SentiNet.
Universal Adversarial Perturbations
The surprising existence of universal perturbations reveals important geometric correlations among the high-dimensional decision boundaries of classifiers and points to a potential security breach: single directions in the input space that adversaries can exploit to break a classifier on most natural images.
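The construction behind universal perturbations can be sketched as an accumulation loop: for each image the current perturbation fails to fool, add a minimal extra perturbation and project back onto the norm ball. Here predict and perturb_one (e.g., a DeepFool-style step) are assumed callables, and eps is an illustrative l_inf budget.

    import numpy as np

    def universal_perturbation(images, predict, perturb_one, eps=0.1, epochs=5):
        # v is a single perturbation reused across all inputs.
        v = np.zeros_like(images[0])
        for _ in range(epochs):
            for x in images:
                if predict(x + v) == predict(x):    # not fooled yet
                    dv = perturb_one(x + v)         # minimal extra perturbation
                    v = np.clip(v + dv, -eps, eps)  # project onto the l_inf ball
        return v
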
Gradient-based learning applied to document recognition
This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task; convolutional neural networks are shown to outperform all other techniques.
DPATCH: An Adversarial Patch Attack on Object Detectors
Extensive evaluations imply that DPatch can perform effective attacks in a black-box setup, i.e., even without knowledge of the attacked network's architecture and parameters, making it very practical for real-world attacks.
Clean-Label Backdoor Attacks
This paper introduces a new approach to executing backdoor attacks that uses adversarial examples and GAN-generated data, ensuring that the resulting poisoned inputs appear consistent with their labels and thus seem benign even under human inspection.
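A sketch of the clean-label recipe implied by the summary: weaken the image's natural features with an adversarial perturbation, then stamp the trigger, while the label stays untouched. Both adv_perturb and add_trigger are assumed helpers, not this paper's exact procedure.

    def make_clean_label_poison(x, y, adv_perturb, add_trigger):
        # The adversarial perturbation makes the natural features hard to
        # learn, so the model leans on the trigger instead.
        x_adv = adv_perturb(x, y)
        # Stamp the backdoor trigger on the perturbed image.
        x_poison = add_trigger(x_adv)
        # The label is unchanged, so the sample looks benign to inspectors.
        return x_poison, y
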
Multi-Task Learning as Multi-Objective Optimization
This paper proposes an upper bound for the multi-objective loss, shows that it can be optimized efficiently, and proves that optimizing this upper bound yields a Pareto-optimal solution under realistic assumptions.