• Corpus ID: 59336268

Improving Adversarial Robustness of Ensembles with Diversity Training

@article{Kariyappa2019ImprovingAR,
  title={Improving Adversarial Robustness of Ensembles with Diversity Training},
  author={Sanjay Kariyappa and Moinuddin K. Qureshi},
  journal={ArXiv},
  year={2019},
  volume={abs/1901.09981}
}
Deep Neural Networks are vulnerable to adversarial attacks even in settings where the attacker has no direct access to the model being attacked. Such attacks usually rely on the principle of transferability, whereby an attack crafted on a surrogate model tends to transfer to the target model. We show that an ensemble of models with misaligned loss gradients can provide an effective defense against transfer-based attacks. Our key insight is that an adversarial example is less likely to fool… 
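The core idea — an ensemble whose members have misaligned loss gradients — can be illustrated with a minimal numpy sketch. The function below computes the mean pairwise cosine similarity between per-model loss gradients; driving this quantity down during training is the spirit of diversity training, though the function name and this simplified measure are illustrative, not the paper's exact formulation.

```python
import numpy as np

def gradient_alignment_penalty(grads):
    """Mean pairwise cosine similarity between the per-model loss
    gradients of an ensemble. Lower (more negative) values mean more
    misaligned gradients, which makes transfer attacks harder."""
    grads = [g / (np.linalg.norm(g) + 1e-12) for g in grads]
    sims = [float(np.dot(grads[i], grads[j]))
            for i in range(len(grads)) for j in range(i + 1, len(grads))]
    return sum(sims) / len(sims)

# Two models with nearly opposite gradients -> strongly negative alignment,
# i.e. a perturbation that raises one model's loss tends to lower the other's.
g1 = np.array([1.0, 0.0, 0.5])
g2 = np.array([-1.0, 0.1, -0.5])
penalty = gradient_alignment_penalty([g1, g2])
```

In training, a term like this would be added to the classification loss so that the optimizer pushes member gradients apart while each member still fits the data.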


Evaluating Ensemble Robustness Against Adversarial Attacks

This paper introduces a gradient-based measure of how effectively an ensemble's constituent models collaborate to reduce the space of adversarial examples targeting the ensemble itself, and demonstrates that this measure can be used during training to increase an ensemble's robustness to adversarial examples.

Improving Ensemble Robustness by Collaboratively Promoting and Demoting Adversarial Robustness

This work proposes a simple but effective strategy for collaboration among the committee models of an ensemble via secure and insecure sets defined for each member on a given sample. This provides the flexibility to reduce adversarial transferability and promote diversity among ensemble members, two factors crucial for the robustness of this ensemble approach.

Resisting Adversarial Attacks in Deep Neural Networks using Diverse Decision Boundaries

This paper develops a new ensemble-based defense that constructs defender models with decision boundaries diverse with respect to the original model, and presents extensive experiments on standard image classification datasets against state-of-the-art adversarial attacks to demonstrate the robustness of the proposed ensemble-based defense.

DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles

DVERGE is proposed, which isolates the adversarial vulnerability of each sub-model by distilling non-robust features and diversifies this vulnerability to induce diverse outputs against transfer attacks, enabling improved robustness as more sub-models are added to the ensemble.

Defending Adversarial Examples by Negative Correlation Ensemble

A new ensemble defense approach named the Negative Correlation Ensemble (NCEn) achieves compelling results by making the gradient directions and gradient magnitudes of the ensemble members negatively correlated, thereby reducing the transferability of adversarial examples among them.

More or Less (MoL): Defending against Multiple Perturbation Attacks on Deep Neural Networks through Model Ensemble and Compression

  • Hao Cheng, Kaidi Xu, R. Goldhahn
  • Computer Science
    2022 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW)
  • 2022
This work investigates the design of deep learning systems robust to multiple perturbation attacks and adopts a model ensemble approach in which expert models handling different perturbation types are integrated through a trainable aggregator module.

MORA: Improving Ensemble Robustness Evaluation with Model-Reweighing Attack

MORA is introduced, a model-reweighing attack that steers adversarial example synthesis by reweighing the importance of sub-model gradients, finding that recent ensemble defenses all exhibit varying degrees of overestimated robustness.

Adversarial Vulnerability of Randomized Ensembles

This work proposes a theoretically sound and efficient adversarial attack algorithm (ARC) capable of compromising randomized ensembles even in cases where adaptive PGD fails to do so, and empirically establishes that randomized ensembles are in fact more vulnerable to ℓp-bounded adversarial perturbations than standard adversarially trained models.

Exploiting Joint Robustness to Adversarial Perturbations

This paper exploits first-order interactions within ensembles to formalize a reliable and practical defense, presenting joint gradient phase and magnitude regularization (GPMR) to enforce the desired interactions among members of the ensemble.

Building Robust Ensembles via Margin Boosting

This work takes a principled approach to building robust ensembles, developing an algorithm for learning an ensemble with maximum margin that outperforms not only existing ensembling techniques but also large models trained in an end-to-end fashion.
...

References

SHOWING 1-10 OF 30 REFERENCES

Discovering Adversarial Examples with Momentum

A strong attack algorithm named momentum iterative fast gradient sign method (MI-FGSM) is proposed to discover adversarial examples and can serve as a benchmark attack algorithm for evaluating the robustness of various models and defense methods.
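MI-FGSM as summarized above is concrete enough to sketch: accumulate a momentum of L1-normalized gradients, then step in the sign direction of that velocity, clipping to the perturbation budget. The numpy sketch below uses a toy gradient oracle in place of a real model's backward pass; parameter names are illustrative.

```python
import numpy as np

def mi_fgsm(x, grad_fn, eps=0.3, steps=10, mu=1.0):
    """Momentum Iterative FGSM sketch: accumulate a velocity of
    L1-normalized loss gradients (decay factor mu), step by
    alpha * sign(velocity), and clip to the eps-ball around x."""
    alpha = eps / steps
    g = np.zeros_like(x)
    x_adv = x.copy()
    for _ in range(steps):
        grad = grad_fn(x_adv)                       # would be a model's dL/dx
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)
        x_adv = np.clip(x_adv + alpha * np.sign(g), x - eps, x + eps)
    return x_adv

# Toy "loss" L(x) = sum(x): its gradient is all ones, so the attack
# should push every coordinate up by exactly the budget eps.
x0 = np.zeros(4)
x_adv = mi_fgsm(x0, lambda x: np.ones_like(x), eps=0.3, steps=10)
```

The momentum term is what distinguishes MI-FGSM from plain iterative FGSM: it stabilizes the update direction across iterations, which is what makes the resulting examples transfer better.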

Ensemble Adversarial Training: Attacks and Defenses

This work finds that adversarial training remains vulnerable to black-box attacks, where perturbations computed on undefended models are transferred to the target, as well as to a powerful novel single-step attack that escapes the non-smooth vicinity of the input data via a small random step.

Ensemble Methods as a Defense to Adversarial Perturbations Against Deep Neural Networks

It is empirically shown that ensemble methods not only improve the accuracy of neural networks on test data but also increase their robustness against adversarial perturbations.

Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models

The proposed Defense-GAN, a new framework leveraging the expressive capability of generative models to defend deep neural networks against adversarial perturbations, is empirically shown to be consistently effective against different attack methods and improves on existing defense strategies.

Towards Deep Learning Models Resistant to Adversarial Attacks

This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.

Stochastic Activation Pruning for Robust Adversarial Defense

Stochastic Activation Pruning (SAP) is proposed, a mixed strategy for adversarial defense that prunes a random subset of activations (preferentially pruning those with smaller magnitude) and scales up the survivors to compensate.
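The SAP recipe in the summary — sample a random subset of activations biased toward large magnitudes, drop the rest, rescale survivors — can be sketched in a few lines of numpy. This is a simplified illustration of the sampling-and-rescaling idea, not the paper's implementation; the function name and the inverse-survival-probability rescaling below are assumptions of this sketch.

```python
import numpy as np

def stochastic_activation_pruning(a, n_samples, rng):
    """SAP sketch for one activation vector: draw n_samples indices with
    replacement, with probability proportional to |a_i| (so small
    activations are preferentially pruned), zero everything not drawn,
    and rescale each survivor by the inverse of its survival probability
    so the layer output is roughly unbiased in expectation."""
    p = np.abs(a) / (np.abs(a).sum() + 1e-12)
    picked = rng.choice(len(a), size=n_samples, replace=True, p=p)
    keep = np.zeros(len(a), dtype=bool)
    keep[picked] = True
    scale = np.zeros_like(a)
    # P(index i survives at least one of n_samples draws) = 1 - (1 - p_i)^n
    scale[keep] = 1.0 / (1.0 - (1.0 - p[keep]) ** n_samples)
    return a * keep * scale

rng = np.random.default_rng(0)
a = np.array([0.0, 1.0, 2.0, 3.0])
pruned = stochastic_activation_pruning(a, n_samples=3, rng=rng)
```

Because the kept set is re-sampled on every forward pass, an attacker's gradient is computed through a different random mask each time, which is the source of the defense's stochasticity.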

Adversarial Machine Learning at Scale

This research applies adversarial training to ImageNet, finds that single-step attacks are the best for mounting black-box attacks, and resolves a "label leaking" effect that causes adversarially trained models to perform better on adversarial examples than on clean examples.

Mitigating adversarial effects through randomization

This paper proposes to utilize randomization at inference time to mitigate adversarial effects, and uses two randomization operations: random resizing, which resizes the input images to a random size, and random padding, which pads zeros around the input image in a random manner.
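The two randomization operations described above compose naturally into one input transform. A minimal numpy sketch, assuming square single-channel images and using index-based nearest-neighbor resizing to stay dependency-free (the paper's pipeline operates on full-size color images):

```python
import numpy as np

def random_resize_pad(img, out_size, rng):
    """Randomization-defense sketch: resize a square image to a random
    smaller size (nearest-neighbor via index sampling), then place it
    at a random offset inside a zero-padded (out_size, out_size) canvas."""
    h = int(rng.integers(out_size // 2, out_size + 1))   # random target size
    idx = (np.arange(h) * img.shape[0] / h).astype(int)  # nearest-neighbor rows/cols
    small = img[idx][:, idx]
    top = int(rng.integers(0, out_size - h + 1))         # random padding offsets
    left = int(rng.integers(0, out_size - h + 1))
    out = np.zeros((out_size, out_size), dtype=img.dtype)
    out[top:top + h, left:left + h] = small
    return out

padded = random_resize_pad(np.ones((8, 8)), out_size=10, rng=np.random.default_rng(0))
```

Since the resize factor and padding offsets are re-drawn at every inference, a fixed adversarial perturbation is misaligned with the network's receptive fields on most forward passes.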

Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks

The study shows that defensive distillation can reduce the effectiveness of adversarial sample crafting from 95% to less than 0.5% on a studied DNN, and analytically investigates the generalizability and robustness properties granted by the use of defensive distillation when training DNNs.

Explaining and Harnessing Adversarial Examples

It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.