Corpus ID: 218571440

# Blind Backdoors in Deep Learning Models

@inproceedings{Bagdasaryan2021BlindBI,
title={Blind Backdoors in Deep Learning Models},
author={E. Bagdasaryan and Vitaly Shmatikov},
booktitle={USENIX Security Symposium},
year={2021}
}
• Published in USENIX Security Symposium 2021
• Computer Science
We investigate a new method for injecting backdoors into machine learning models, based on poisoning the loss-value computation in the model-training code. We use it to demonstrate new classes of backdoors strictly more powerful than those in prior literature: single-pixel and physical backdoors in ImageNet models, backdoors that switch the model to a covert, privacy-violating task, and backdoors that do not require inference-time input modifications. Our attack is *blind*: the attacker…
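The loss-value poisoning the abstract describes can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the `synthesize_backdoor` helper, the single-pixel trigger, and the fixed blending coefficient `alpha` are hypothetical simplifications (the paper balances the two objectives adaptively with MGDA rather than a fixed coefficient).

```python
import numpy as np

def cross_entropy(logits, labels):
    # Softmax cross-entropy, averaged over the batch.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def synthesize_backdoor(x, y, target_label):
    # Hypothetical single-pixel trigger: force one corner pixel to
    # maximum intensity and relabel the batch to the attacker's target.
    x_bd = x.copy()
    x_bd[:, 0, 0] = 1.0
    y_bd = np.full_like(y, target_label)
    return x_bd, y_bd

def blind_loss(predict, x, y, target_label=0, alpha=0.5):
    # The compromised loss computation: the model is trained both on the
    # main task and on mapping triggered inputs to the target label,
    # without the attacker ever seeing the training data or the
    # resulting model ("blind").
    loss_main = cross_entropy(predict(x), y)
    x_bd, y_bd = synthesize_backdoor(x, y, target_label)
    loss_bd = cross_entropy(predict(x_bd), y_bd)
    return alpha * loss_main + (1.0 - alpha) * loss_bd
```

Because only the returned scalar changes, the surrounding training loop, data pipeline, and checkpoints look unmodified, which is what makes code-level loss poisoning hard to spot in review.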
Backdoor Attacks and Countermeasures on Deep Learning: A Comprehensive Review
This work provides the community with a timely, comprehensive review of backdoor attacks and countermeasures on deep learning, and identifies key areas for future research, such as empirical security evaluations of physical trigger attacks and more efficient, practical countermeasures.
Spinning Sequence-to-Sequence Models with Meta-Backdoors
• Computer Science
• ArXiv
• 2021
By showing that sequence-to-sequence models, such as those used for news summarization, are vulnerable to “backdoor” attacks that introduce spin into their output, this work aims to increase awareness of threats to ML supply chains and improve their trustworthiness by developing better defenses.
A Real-time Defense against Website Fingerprinting Attacks
• Computer Science
• ArXiv
• 2021
This work proposes Dolos, a system that modifies user network traffic in real time to successfully evade Website Fingerprinting attacks and is robust against a variety of adaptive countermeasures designed to detect or disrupt the defense.
Backdoor Attacks on Network Certification via Data Poisoning
• Computer Science
• ArXiv
• 2021
Certifiers for neural networks have made great progress towards provable robustness guarantees against evasion attacks using adversarial examples. However, introducing certifiers into deep learning…
BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning
• Jinyuan Jia
• Computer Science
• ArXiv
• 2021
This work proposes BadEncoder, the first backdoor attack on self-supervised learning, which injects backdoors into a pre-trained image encoder such that downstream classifiers built on the backdoored image encoder for different downstream tasks simultaneously inherit the backdoor behavior.
Can You Hear It? Backdoor Attacks via Ultrasonic Triggers
This work explores backdoor attacks on automatic speech recognition systems in which inaudible triggers are injected, and observes that even short, non-continuous triggers result in highly successful attacks.
Decentralized Deep Learning for Mobile Edge Computing: A Survey on Communication Efficiency and Trustworthiness
• Computer Science
• ArXiv
• 2021
The technical fundamentals of DDL are demonstrated, along with how decentralized learning can benefit many walks of society, and the most relevant solutions are surveyed from the novel perspectives of communication efficiency and trustworthiness.
DeepSweep: An Evaluation Framework for Mitigating DNN Backdoor Attacks using Data Augmentation
• Computer Science
• AsiaCCS
• 2021
A systematic approach is proposed to discover the optimal policies for defending against different backdoor attacks by comprehensively evaluating 71 state-of-the-art data augmentation functions; the authors envision this framework as a benchmark tool to advance future DNN backdoor studies.
Detecting Backdoored Neural Networks with Structured Adversarial Attacks
• Charles Yang
• 2021
EX-RAY: Distinguishing Injected Backdoor from Natural Features in Neural Networks by Examining Differential Feature Symmetry
• Yingqi Liu, Xiangyu Zhang
• Computer Science
• ArXiv
• 2021
A novel symmetric feature differencing method identifies a smallest set of features separating two classes; it outperforms false-positive removal methods based on L2 distance and attribution techniques, and demonstrates its potential in detecting a number of semantic backdoor attacks.

#### References

Showing 1–10 of 123 references
Automatic differentiation in PyTorch
An automatic differentiation module of PyTorch is described: a library designed to enable rapid research on machine learning models, focusing on differentiation of purely imperative programs with an emphasis on extensibility and low overhead.
ImageNet Large Scale Visual Recognition Challenge
The creation of this benchmark dataset and the advances in object recognition that it has made possible are described, and state-of-the-art computer vision accuracy is compared with human accuracy.
Multiple-gradient descent algorithm (MGDA) for multiobjective optimization
One considers the context of the concurrent optimization of several criteria J_i(Y) (i = 1, …, n), supposed to be smooth functions of the design vector Y ∈ R^N (n ≤ N). An original…
Robust Anomaly Detection and Backdoor Attack Detection Via Differential Privacy
• Computer Science
• ICLR
• 2020
It is demonstrated that applying differential privacy can improve the utility of outlier detection and novelty detection, with an extension to detect poisoning samples in backdoor attacks.
SentiNet: Detecting Physical Attacks Against Deep Learning Systems
• Computer Science
• ArXiv
• 2018
This work demonstrates the effectiveness of SentiNet on three different attacks (adversarial examples, data poisoning attacks, and trojaned networks) that have large variations in deployment mechanisms, and shows that the defense achieves very competitive performance metrics for all three threats, even against strong adaptive adversaries with full knowledge of SentiNet.
Universal Adversarial Perturbations
• Computer Science, Mathematics
• 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
• 2017
The surprising existence of universal perturbations reveals important geometric correlations among the high-dimensional decision boundary of classifiers and outlines potential security breaches with the existence of single directions in the input space that adversaries can possibly exploit to break a classifier on most natural images.
Gradient-based learning applied to document recognition
• Computer Science
• 1998
This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task; convolutional neural networks are shown to outperform all other techniques.
DPATCH: An Adversarial Patch Attack on Object Detectors
• Xin Liu
• Computer Science
• SafeAI@AAAI
• 2019
Extensive evaluations imply that DPatch can perform effective attacks under a black-box setup, i.e., even without knowledge of the attacked network's architecture and parameters, making it very practical for implementing real-world attacks.
Clean-Label Backdoor Attacks
• Computer Science
• 2018
This paper introduces a new approach to executing backdoor attacks that utilizes adversarial examples and GAN-generated data, ensuring that the resulting poisoned inputs appear consistent with their labels and thus seem benign even upon human inspection.
Multi-Task Learning as Multi-Objective Optimization
• Computer Science, Mathematics
• NeurIPS
• 2018
This paper proposes an upper bound for the multi-objective loss and shows that it can be optimized efficiently, and proves that optimizing this upper bound yields a Pareto optimal solution under realistic assumptions.
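Two of the references above (Désidéri's MGDA and the NeurIPS 2018 multi-task work) reduce multi-objective training to finding the min-norm convex combination of per-task gradients, which in the two-objective case has a closed form. A small sketch, under the assumption of plain NumPy gradient vectors rather than model parameters:

```python
import numpy as np

def mgda_two_task(g1, g2):
    # Min-norm element of the convex hull of {g1, g2}
    # (the two-objective case of MGDA admits a closed form):
    #   gamma* = clip(<g2 - g1, g2> / ||g1 - g2||^2, 0, 1)
    # The combined vector gamma*g1 + (1-gamma)*g2 is a common descent
    # direction for both objectives (or zero at a Pareto-stationary point).
    diff = g1 - g2
    denom = float(diff @ diff)
    if denom == 0.0:
        return g1.copy()  # gradients coincide; any combination works
    gamma = float(np.clip((g2 - g1) @ g2 / denom, 0.0, 1.0))
    return gamma * g1 + (1.0 - gamma) * g2
```

With orthogonal unit gradients the result is their midpoint; when one gradient dominates the other along the same direction, the clipping returns the shorter gradient unchanged.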