• Corpus ID: 213183269

# RAB: Provable Robustness Against Backdoor Attacks

@article{Weber2020RABPR,
  title={RAB: Provable Robustness Against Backdoor Attacks},
  author={Maurice Weber and Xiaojun Xu and Bojan Karlas and Ce Zhang and Bo Li},
  journal={ArXiv},
  year={2020},
  volume={abs/2003.08904}
}
• Published 19 March 2020, Computer Science, ArXiv
Recent studies have shown that deep neural networks are highly vulnerable to adversarial attacks, including evasion and backdoor attacks. On the defense side, there has been intensive interest in provable robustness against evasion attacks, while robustness guarantees against backdoor attacks are still lacking. In this paper, we focus on certifying model robustness against general threat models. We first provide a unified framework via randomized smoothing and show it can be instantiated to certify…
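The framework the abstract names builds on randomized smoothing. As a rough illustration of the generic smoothing idea only (not the paper's exact backdoor construction), a smoothed classifier predicts by majority vote over noisy copies of the input; the function and parameter names below are my own:

```python
import numpy as np

def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=1000, rng=None):
    """Majority-vote prediction of a randomized-smoothing classifier.

    base_classifier: maps a 1-D input array to an integer class label.
    sigma: standard deviation of the isotropic Gaussian smoothing noise.
    """
    rng = np.random.default_rng(rng)
    votes = {}
    for _ in range(n_samples):
        noisy = x + rng.normal(0.0, sigma, size=x.shape)
        label = base_classifier(noisy)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# toy base classifier: sign of the first coordinate
clf = lambda z: int(z[0] > 0)
print(smoothed_predict(clf, np.array([1.0, 0.0]), sigma=0.1))
```

Robustness certificates then follow from how stable this vote is under small shifts of `x`; RAB applies the analogous idea to the training set rather than a test input.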
## 84 Citations

• Computer Science, ArXiv, 2022
Flareon is proposed, a small, stealthy, seemingly harmless code modification that targets the data augmentation pipeline with motion-based triggers and assumes no prior knowledge of the victim model architecture or training data.
• Computer Science, CCS, 2022
This work proposes the task of target identification, which determines whether a specific test instance is the target of a training-set attack, and builds on influence estimation, which quantifies each training instance's contribution to a model's prediction.
• Computer Science, ICLR, 2021
Deep Partition Aggregation (DPA), a certified defense against a general poisoning threat model, is proposed together with its semi-supervised variant SS-DPA, which uses a semi-supervised learning algorithm as its base classifier; SS-DPA outperforms the existing certified defense for label-flipping attacks, establishing new state-of-the-art provable defenses against poisoning attacks.
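DPA's certificate comes from training many base models on disjoint hash-partitions of the training data and taking a plurality vote: one poisoned sample lands in exactly one partition and so flips at most one vote. A minimal sketch under that intuition (my own simplification with a toy 1-NN base learner, not the authors' code):

```python
from collections import Counter

def partition(train, k):
    """Deterministically assign each (x, y) pair to one of k disjoint partitions."""
    parts = [[] for _ in range(k)]
    for x, y in train:
        parts[hash(x) % k].append((x, y))
    return parts

def train_base(part):
    """Toy base learner: 1-nearest-neighbour classifier over one partition."""
    def predict(q):
        if not part:
            return 0  # arbitrary default for an empty partition
        x, y = min(part, key=lambda p: abs(p[0] - q))
        return y
    return predict

def dpa_predict(train, q, k=5):
    """Plurality vote over k partition-trained models, plus a certificate:
    since each poisoned sample changes at most one partition's model, the
    prediction is stable against roughly gap // 2 poisoned samples
    (ignoring tie-breaking details)."""
    votes = Counter(train_base(p)(q) for p in partition(train, k))
    ranked = votes.most_common() + [(None, 0)]
    (top, n1), (_, n2) = ranked[0], ranked[1]
    return top, (n1 - n2) // 2
```

The returned pair is (predicted label, number of poisoned training samples provably tolerated).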
• Computer Science, ArXiv, 2020
This work provides the community with a timely, comprehensive review of backdoor attacks and countermeasures on deep learning, and presents key areas for future backdoor research, such as empirical security evaluations of physical trigger attacks; more efficient and practical countermeasures are solicited.
• Art, ACM Computing Surveys, 2023
The success of machine learning is fueled by the increasing availability of computing power and large training datasets. The training data is used to learn new models or update existing ones…
• Computer Science, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023
The goal of this work is to systematically categorize and discuss a wide range of dataset vulnerabilities and exploits, approaches for defending against these threats, and an array of open problems in this space.
• Neural networks, being susceptible to adversarial attacks, should face a strict level of scrutiny before being deployed in critical or adversarial applications. This paper uses ideas from Chaos…
• This paper investigates both user-level and instance-level privacy of FL, proposes novel mechanisms to achieve improved instance-level privacy, and proves the certified robustness of DPFL under a bounded number of adversarial users or instances.
• Computer Science, ArXiv, 2022
An algorithmic framework is introduced that can mitigate potential security vulnerabilities in a pre-trained model when clean data from its training distribution is unavailable to the defender.
• Fan Wu, Computer Science, ICLR, 2022
This work proposes the first certification framework, COPA, to certify the number of poisoning trajectories that can be tolerated regarding different certification criteria, and proposes two certification criteria: per-state action stability and cumulative reward bound.

## References

Showing 1-10 of 80 references

• Computer Science, ICML, 2019
Strong empirical results suggest that randomized smoothing is a promising direction for future research into adversarially robust classification; on smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies.
• Computer Science, Mathematics, ICML, 2020
It is shown that with only label statistics under random input perturbations, randomized smoothing cannot achieve nontrivial certified accuracy against perturbations of $\ell_p$-norm $\Omega(\min(1, d^{\frac{1}{p} - \frac{1}{2}}))$, when the input dimension $d$ is large.
• Computer Science, Mathematics, NeurIPS, 2018
Spectral signatures are identified as a new property of all known backdoor attacks, which allows tools from robust statistics to thwart the attacks; the efficacy of these signatures in detecting and removing poisoned examples is demonstrated on real image sets and state-of-the-art neural network architectures.
• Computer Science, 2019 IEEE Symposium on Security and Privacy (SP), 2019
This paper presents the first certified defense that both scales to large networks and datasets and applies broadly to arbitrary model types, based on a novel connection between robustness against adversarial examples and differential privacy, a cryptographically-inspired privacy formalism.
• Computer Science, ArXiv, 2017
This work considers a new type of attack, called backdoor attacks, where the attacker's goal is to create a backdoor into a learning-based authentication system, so that he can easily circumvent the system by leveraging the backdoor.
• Computer Science, CCS, 2016
This work develops new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy, and demonstrates that deep neural networks can be trained with non-convex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.
• Computer Science, Mathematics, ICML, 2020
This work presents a unifying view of randomized smoothing over arbitrary functions, and uses this novel characterization to propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
• Computer Science, SafeAI@AAAI, 2019
This work proposes a novel approach to backdoor detection and removal for neural networks; it is the first methodology capable of detecting poisoned data crafted to insert backdoors and repairing the model, without requiring a verified and trusted dataset.
• Computer Science, Mathematics, ACISP, 2003
A new design is proposed that eliminates the need for known plaintext entirely and employs "data compression" as a basic tool for generating a hidden information channel, highlighting the need to only encrypt compressed strings when a block cipher with a secret design must be used.
• Computer Science, ArXiv, 2017
It is shown that outsourced training introduces new security risks: an adversary can create a maliciously trained network (a backdoored neural network, or a BadNet) that has state-of-the-art performance on the user's training and validation samples, but behaves badly on specific attacker-chosen inputs.