Corpus ID: 211987801

Colored Noise Injection for Training Adversarially Robust Neural Networks

@article{Zheltonozhskii2020ColoredNI,
  title={Colored Noise Injection for Training Adversarially Robust Neural Networks},
  author={Evgenii Zheltonozhskii and Chaim Baskin and Yaniv Nemcovsky and Brian Chmiel and Avi Mendelson and Alexander M. Bronstein},
  journal={ArXiv},
  year={2020},
  volume={abs/2003.02188}
}
Even though deep learning has shown unmatched performance on various tasks, neural networks have been shown to be vulnerable to small adversarial perturbations of the input that lead to significant performance degradation. In this work we extend the idea of adding white Gaussian noise to the network weights and activations during adversarial training (PNI) to the injection of colored noise for defense against common white-box and black-box attacks. We show that our approach outperforms PNI and… 
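As a rough illustration of the idea, the sketch below shows a PyTorch-style linear layer that perturbs its weights with correlated ("colored") Gaussian noise on every forward pass. The module name, the fixed smoothing kernel used to correlate the noise, and the trainable scale are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ColoredNoiseLinear(nn.Module):
    """Hypothetical sketch: a linear layer whose weights are perturbed by
    correlated ("colored") Gaussian noise on every forward pass."""

    def __init__(self, in_features, out_features, kernel_size=3):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Trainable noise scale, learned jointly with the weights.
        self.alpha = nn.Parameter(torch.tensor(0.1))
        # Fixed smoothing kernel that correlates white noise along the input
        # dimension; this particular coloring scheme is an assumption.
        self.register_buffer("kernel",
                             torch.ones(1, 1, kernel_size) / kernel_size)

    def forward(self, x):
        w = self.linear.weight
        # White Gaussian noise matched to the weight tensor's std ...
        white = torch.randn_like(w) * w.detach().std()
        # ... made "colored" by smoothing each row with the kernel.
        colored = F.conv1d(white.unsqueeze(1), self.kernel,
                           padding=self.kernel.shape[-1] // 2).squeeze(1)
        return F.linear(x, w + self.alpha * colored, self.linear.bias)
```

Trained jointly with an adversarial objective, such a layer plays the role that white-noise injection plays in PNI while letting the injected noise be correlated rather than white.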

Citations

Smoothed Inference for Adversarially-Trained Models

This work examines randomized smoothing as a way to improve performance on unperturbed data as well as robustness to adversarial attacks, and finds that it lends itself well to trading off model inference complexity against performance.
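As a rough sketch of the smoothing step (the noise level and sample count below are arbitrary choices, not the paper's):

```python
import torch

@torch.no_grad()
def smoothed_predict(model, x, sigma=0.25, n_samples=32):
    """Average softmax outputs over Gaussian-perturbed copies of x."""
    probs = None
    for _ in range(n_samples):
        noisy = x + sigma * torch.randn_like(x)
        p = torch.softmax(model(noisy), dim=1)
        probs = p if probs is None else probs + p
    # More samples give a better estimate at a proportionally higher
    # inference cost, which is the trade-off mentioned above.
    return (probs / n_samples).argmax(dim=1)
```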

On the Algorithmic Stability of Adversarial Training

It is argued that the non-differentiability of adversarial training causes adversarially trained models to have worse algorithmic stability than their naturally trained counterparts, and a noise injection method is considered to tackle this problem, enabling the training trajectory to avoid non-differentiability with dominating probability.

Resurrecting Trust in Facial Recognition: Mitigating Backdoor Attacks in Face Recognition to Prevent Potential Privacy Breaches

Comparisons with existing approaches show that BA-BAM provides a more practical backdoor mitigation approach for face recognition; it is highly robust, incurring a maximum accuracy drop of 2.4% while reducing the attack success rate to at most 20%.

References

Showing 1–10 of 27 references

Parametric Noise Injection: Trainable Randomness to Improve Deep Neural Network Robustness Against Adversarial Attack

Parametric Noise Injection (PNI) is proposed, which injects trainable Gaussian noise at each layer, on either the activations or the weights, by solving a min-max optimization problem embedded with adversarial training, and effectively improves DNN robustness against adversarial attacks.
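The core of PNI fits in a few lines: a per-layer trainable scalar scales Gaussian noise whose standard deviation tracks that of the layer's weights. The sketch below is an approximation of that idea in PyTorch, not the reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PNIConv2d(nn.Module):
    """PNI-style convolution: trainable Gaussian noise added to the weights
    on every forward pass."""

    def __init__(self, in_ch, out_ch, kernel_size, **kw):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, **kw)
        self.alpha = nn.Parameter(torch.tensor(0.25))  # trainable noise scale

    def forward(self, x):
        w = self.conv.weight
        # Noise std follows the current std of the weight tensor.
        noisy_w = w + self.alpha * torch.randn_like(w) * w.detach().std()
        return F.conv2d(x, noisy_w, self.conv.bias, self.conv.stride,
                        self.conv.padding, self.conv.dilation, self.conv.groups)
```

During adversarial training, alpha and the weights are optimized together under the usual min-max objective.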

ZOO: Zeroth Order Optimization Based Black-box Attacks to Deep Neural Networks without Training Substitute Models

An effective black-box attack is proposed that only has access to the input (images) and the output (confidence scores) of a targeted DNN, sparing the need to train substitute models and avoiding the loss in attack transferability.
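The zeroth-order ingredient can be illustrated with a coordinate-wise finite-difference estimate of the loss gradient that uses only model outputs; the loss function and step size below are illustrative assumptions rather than the paper's exact objective.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def zoo_gradient_estimate(model, x, target, coords, h=1e-4):
    """Estimate d(loss)/dx on selected input coordinates using only model
    outputs (no backpropagation), via symmetric finite differences."""
    def loss(inp):
        return F.cross_entropy(model(inp), target)

    grad = torch.zeros(x.numel(), device=x.device)
    for idx in coords:                      # e.g. a few random flat indices
        e = torch.zeros(x.numel(), device=x.device)
        e[idx] = h
        e = e.view_as(x)
        grad[idx] = (loss(x + e) - loss(x - e)) / (2 * h)
    return grad.view_as(x)
```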

Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets

Adversarial training is by far the most successful strategy for improving the robustness of neural networks to adversarial attacks. Despite its success as a defense mechanism, adversarial training fails…

Training Robust Deep Neural Networks via Adversarial Noise Propagation

A simple yet powerful training algorithm, named Adversarial Noise Propagation (ANP), is proposed, which injects noise into the hidden layers in a layer-wise manner, provides strong robustness for deep models against both adversarial and corrupted examples, and significantly outperforms various adversarial defense methods.

Defending against Whitebox Adversarial Attacks via Randomized Discretization

A simple and computationally efficient defense strategy: inject random Gaussian noise, discretize each pixel, and then feed the result into any pre-trained classifier, leading to a lower bound on the classification accuracy of any classifier against any (potentially whitebox) $L_\infty$-bounded adversarial attack.
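The pipeline is simple enough to spell out. The sketch below adds Gaussian noise and snaps each pixel to evenly spaced levels; the paper itself discretizes to per-image cluster centers, so the uniform quantizer and the parameter values here are simplifications.

```python
import torch

def randomized_discretize(x, sigma=0.1, levels=8):
    """Add Gaussian noise, then quantize each pixel (assumed in [0, 1])
    to one of `levels` evenly spaced values."""
    noisy = (x + sigma * torch.randn_like(x)).clamp(0.0, 1.0)
    return torch.round(noisy * (levels - 1)) / (levels - 1)

# Usage: logits = pretrained_classifier(randomized_discretize(images))
```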

Explaining and Harnessing Adversarial Examples

It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.
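The attack introduced in that paper, the fast gradient sign method (FGSM), follows directly from the linearity argument: perturb the input by epsilon in the sign direction of the input gradient of the loss. A minimal sketch, assuming inputs in [0, 1]:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """Fast gradient sign method: x_adv = x + eps * sign(grad_x loss)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).clamp(0.0, 1.0).detach()
```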

Towards Deep Learning Models Resistant to Adversarial Attacks

This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
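The robust optimization objective is a saddle-point problem: minimize over the network parameters the expected worst-case loss over an L-infinity ball around each input. In practice the inner maximization is approximated with projected gradient descent (PGD). A compact sketch, where the step sizes and the [0, 1] input range are assumptions:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Approximate the inner maximization of the min-max objective with
    projected gradient descent inside an L-infinity ball of radius eps."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv + alpha * grad.sign()
        # Project back onto the eps-ball around x and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()
```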

Delving into Transferable Adversarial Examples and Black-box Attacks

This work is the first to conduct an extensive study of transferability over large models and a large-scale dataset, and it is also the first to study the transferability of targeted adversarial examples with their target labels.

Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses

An efficient approach is proposed to generate gradient-based attacks that induce misclassifications with low L2 norm, by decoupling the direction and the norm of the adversarial perturbation that is added to the image.
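A simplified sketch of the decoupling: the direction of each update comes from the normalized loss gradient, while the L2 norm budget is adapted multiplicatively, shrinking when the current example already fools the model and growing otherwise. The schedule and constants below are illustrative, not the paper's exact algorithm.

```python
import torch
import torch.nn.functional as F

def ddn_like_attack(model, x, y, steps=20, alpha=0.05, gamma=0.05):
    """Simplified decoupled direction/norm L2 attack: the gradient gives the
    direction, while a per-sample norm budget eps is adapted each step."""
    delta = torch.zeros_like(x)
    eps = torch.ones(x.shape[0], device=x.device)       # per-sample L2 budget
    for _ in range(steps):
        delta = delta.clone().detach().requires_grad_(True)
        logits = model(x + delta)
        loss = F.cross_entropy(logits, y)
        grad, = torch.autograd.grad(loss, delta)
        # Direction: a fixed-size step along the normalized gradient.
        g = grad.flatten(1)
        g = g / (g.norm(dim=1, keepdim=True) + 1e-12)
        d = delta.flatten(1) + alpha * g
        # Norm: shrink the budget if the example is already adversarial,
        # grow it otherwise, then rescale the perturbation to that budget.
        adversarial = logits.argmax(dim=1) != y
        eps = torch.where(adversarial, eps * (1 - gamma), eps * (1 + gamma))
        d = d * (eps.unsqueeze(1) / (d.norm(dim=1, keepdim=True) + 1e-12))
        delta = d.view_as(x)
    return (x + delta).clamp(0.0, 1.0).detach()
```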

Efficient Defenses Against Adversarial Attacks

This work proposes a new defense method, based on practical observations, that is easy to integrate into models and performs better than state-of-the-art defenses against adversarial attacks on deep neural networks.