Colored Noise Injection for Training Adversarially Robust Neural Networks
@article{Zheltonozhskii2020ColoredNI,
  title={Colored Noise Injection for Training Adversarially Robust Neural Networks},
  author={Evgenii Zheltonozhskii and Chaim Baskin and Yaniv Nemcovsky and Brian Chmiel and Avi Mendelson and Alexander M. Bronstein},
  journal={ArXiv},
  year={2020},
  volume={abs/2003.02188}
}
Even though deep learning has shown unmatched performance on various tasks, neural networks have been shown to be vulnerable to small adversarial perturbations of the input that lead to significant performance degradation. In this work we extend the idea of adding white Gaussian noise to the network weights and activations during adversarial training (PNI) to the injection of colored noise for defense against common white-box and black-box attacks. We show that our approach outperforms PNI and…
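The core idea is to replace the independent (white) Gaussian noise of PNI with correlated (colored) noise whose covariance is learned jointly with the network. Below is a minimal sketch of one way to realize this, not the authors' implementation: a linear layer whose weight noise has covariance L L^T + sigma^2 I, with a learnable low-rank factor. The class and parameter names (ColoredNoiseLinear, noise_factor, log_sigma, rank) are illustrative assumptions.

    import torch
    import torch.nn as nn

    class ColoredNoiseLinear(nn.Linear):
        """Linear layer with correlated (colored) Gaussian weight noise during training."""

        def __init__(self, in_features, out_features, rank=4):
            super().__init__(in_features, out_features)
            n = self.weight.numel()
            # Noise covariance is modeled as L L^T + sigma^2 I: the low-rank factor L
            # correlates noise across weight entries, making it "colored" rather than white.
            self.noise_factor = nn.Parameter(0.01 * torch.randn(n, rank))
            self.log_sigma = nn.Parameter(torch.zeros(1))

        def forward(self, x):
            if self.training:
                z = torch.randn(self.noise_factor.shape[1], device=x.device)
                eps = torch.randn(self.weight.numel(), device=x.device)
                noise = self.noise_factor @ z + self.log_sigma.exp() * eps
                weight = self.weight + noise.view_as(self.weight)
            else:
                weight = self.weight
            return nn.functional.linear(x, weight, self.bias)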
3 Citations
Smoothed Inference for Adversarially-Trained Models
- Computer Science, ArXiv
- 2019
This work examines the application of randomized smoothing as a way to improve performance on unperturbed data as well as to increase robustness to adversarial attacks, and finds that it lends itself well to trading off model inference complexity against performance.
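A minimal sketch of the smoothed-inference idea described above: average the softmax outputs over several Gaussian-perturbed copies of the input and predict the highest-scoring class. The names (smoothed_predict, sigma, n_samples) are placeholders, and the cited paper's exact procedure may differ.

    import torch

    @torch.no_grad()
    def smoothed_predict(model, x, sigma=0.25, n_samples=8):
        # Average class probabilities over several Gaussian-perturbed copies of x.
        probs = 0.0
        for _ in range(n_samples):
            noisy = x + sigma * torch.randn_like(x)
            probs = probs + torch.softmax(model(noisy), dim=-1)
        return (probs / n_samples).argmax(dim=-1)

Increasing n_samples trades inference cost for a smoother, more robust prediction, which is the complexity/performance trade-off the snippet refers to.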
On the Algorithmic Stability of Adversarial Training
- Computer Science, NeurIPS
- 2021
It is argued that the non-differentiability of adversarial training causes worse algorithmic stability than that of its natural (non-adversarial) counterpart, and a noise injection method is considered to tackle this problem, enabling the training trajectory to avoid non-differentiable points with dominating probability.
Resurrecting Trust in Facial Recognition: Mitigating Backdoor Attacks in Face Recognition to Prevent Potential Privacy Breaches
- Computer Science, ArXiv
- 2022
Comparisons with existing approaches show that BA-BAM provides a more practical backdoor mitigation approach for face recognition: it is highly robust, incurs a maximal accuracy drop of 2.4%, and reduces the attack success rate to at most 20%.
References
Showing 1-10 of 27 references
Parametric Noise Injection: Trainable Randomness to Improve Deep Neural Network Robustness Against Adversarial Attack
- Computer Science, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
Parametric Noise Injection (PNI) is proposed, which injects trainable Gaussian noise at each layer, on either activations or weights, by solving a min-max optimization problem embedded in adversarial training, and effectively improves DNN robustness against adversarial attacks.
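A rough sketch of the PNI idea, illustrated here with a single noisy linear layer (the class name PNILinear and the coefficient alpha are hypothetical, not the authors' code): white Gaussian noise with the same scale as the weights is added during training, multiplied by a coefficient learned jointly with adversarial training.

    import torch
    import torch.nn as nn

    class PNILinear(nn.Linear):
        """Linear layer with white Gaussian weight noise scaled by a trainable alpha."""

        def __init__(self, in_features, out_features):
            super().__init__(in_features, out_features)
            self.alpha = nn.Parameter(torch.tensor(0.1))  # trainable noise scale

        def forward(self, x):
            if self.training:
                # White Gaussian noise with the same std as the weights, scaled by alpha.
                noise = torch.randn_like(self.weight) * self.weight.std().detach()
                weight = self.weight + self.alpha * noise
            else:
                weight = self.weight
            return nn.functional.linear(x, weight, self.bias)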
ZOO: Zeroth Order Optimization Based Black-box Attacks to Deep Neural Networks without Training Substitute Models
- Computer Science, AISec@CCS
- 2017
An effective black-box attack is proposed that only has access to the input (images) and the output (confidence scores) of a targeted DNN, sparing the need for training substitute models and avoiding the loss in attack transferability.
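The core primitive behind such zeroth-order attacks is a finite-difference gradient estimate built purely from model queries. A minimal sketch is shown below; loss_fn, n_coords, and h are illustrative names, and this is not the ZOO authors' code.

    import torch

    def zoo_coordinate_gradient(loss_fn, x, n_coords=128, h=1e-4):
        # Estimate the gradient of loss_fn at x from queries only, via symmetric
        # finite differences along a random subset of coordinates.
        flat = x.reshape(-1)
        grad = torch.zeros_like(flat)
        idx = torch.randperm(flat.numel())[:n_coords]
        for i in idx:
            e = torch.zeros_like(flat)
            e[i] = h
            plus = loss_fn((flat + e).view_as(x))
            minus = loss_fn((flat - e).view_as(x))
            grad[i] = (plus - minus) / (2 * h)
        return grad.view_as(x)

Because only loss_fn values are used, the attack needs no gradients and no substitute model, at the cost of many queries per update.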
Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets
- Computer Science, ArXiv
- 2019
Adversarial training is by far the most successful strategy for improving robustness of neural networks to adversarial attacks. Despite its success as a defense mechanism, adversarial training fails…
Training Robust Deep Neural Networks via Adversarial Noise Propagation
- Computer Science, IEEE Transactions on Image Processing
- 2021
A simple yet powerful training algorithm named Adversarial Noise Propagation (ANP) is proposed; it injects noise into the hidden layers in a layer-wise manner, yields strong robustness of deep models against both adversarial and corrupted examples, and significantly outperforms various adversarial defense methods.
Defending against Whitebox Adversarial Attacks via Randomized Discretization
- Computer Science, AISTATS
- 2019
A simple and computationally efficient defense strategy is proposed: inject random Gaussian noise, discretize each pixel, and then feed the result into any pre-trained classifier; this yields a lower bound on the classification accuracy of any classifier against any (potentially white-box) $L_\infty$-bounded adversarial attack.
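A short sketch of this preprocessing pipeline, using simple uniform quantization (the cited paper discretizes to per-pixel cluster centers, so this is only an approximation; randomized_discretize, sigma, and levels are illustrative names):

    import torch

    def randomized_discretize(x, sigma=0.1, levels=8):
        # Inject random Gaussian noise, then quantize each pixel to a uniform grid.
        noisy = x + sigma * torch.randn_like(x)
        q = torch.clamp(noisy, 0.0, 1.0)
        q = torch.round(q * (levels - 1)) / (levels - 1)
        return q  # feed the result into any pre-trained classifier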
Explaining and Harnessing Adversarial Examples
- Computer Science, ICLR
- 2015
It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.
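The linearity argument motivates the fast gradient sign method introduced in the same paper; with model parameters $\theta$, loss $J$, input $x$, label $y$, and perturbation budget $\epsilon$, it takes a single signed gradient step: $x_{\mathrm{adv}} = x + \epsilon \cdot \operatorname{sign}\left(\nabla_x J(\theta, x, y)\right)$.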
Towards Deep Learning Models Resistant to Adversarial Attacks
- Computer Science, ICLR
- 2018
This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
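The robust optimization view corresponds to the saddle-point (min-max) problem, with data distribution $\mathcal{D}$, loss $L$, and an $L_\infty$ perturbation budget $\epsilon$: $\min_\theta \; \mathbb{E}_{(x,y)\sim\mathcal{D}}\left[\max_{\|\delta\|_\infty \le \epsilon} L(\theta, x+\delta, y)\right]$.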
Delving into Transferable Adversarial Examples and Black-box Attacks
- Computer Science, ICLR
- 2017
This work is the first to conduct an extensive study of transferability over large models and a large-scale dataset, and it is also the first to study the transferability of targeted adversarial examples with their target labels.
Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses
- Computer Science, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
An efficient approach is proposed to generate gradient-based attacks that induce misclassifications with low $L_2$ norm, by decoupling the direction and the norm of the adversarial perturbation that is added to the image.
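A rough sketch of the decoupling idea (not the exact DDN algorithm): the gradient of the loss supplies the perturbation's direction, while the perturbation's $L_2$ norm budget is adjusted multiplicatively depending on whether the current example is already adversarial. All names (ddn_like_attack, gamma, alpha) and the single-example simplification are assumptions.

    import torch
    import torch.nn.functional as F

    def ddn_like_attack(model, x, y, steps=20, gamma=0.05, alpha=1.0):
        # Single-example (batch size 1) sketch: direction from the loss gradient,
        # L2 norm budget grown/shrunk depending on whether x + delta is adversarial.
        delta = torch.zeros_like(x)
        eps = 1.0  # current L2 norm budget of the perturbation
        for _ in range(steps):
            delta = delta.detach().requires_grad_(True)
            loss = F.cross_entropy(model(x + delta), y)
            grad, = torch.autograd.grad(loss, delta)
            with torch.no_grad():
                # Direction: normalized gradient ascent step on the loss.
                delta = delta + alpha * grad / grad.norm().clamp_min(1e-12)
                # Norm: shrink the budget if already adversarial, grow it otherwise.
                is_adv = (model(x + delta).argmax(dim=1) != y).item()
                eps = eps * (1 - gamma) if is_adv else eps * (1 + gamma)
                # Rescale the perturbation to the current budget.
                delta = delta * (eps / delta.norm().clamp_min(1e-12))
        return (x + delta).detach()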
Efficient Defenses Against Adversarial Attacks
- Computer Science, AISec@CCS
- 2017
This work proposes a new defense method, based on practical observations, that is easy to integrate into models and performs better than state-of-the-art defenses against adversarial attacks on deep neural networks.