• Corpus ID: 209415135

n-ML: Mitigating Adversarial Examples via Ensembles of Topologically Manipulated Classifiers

Mahmood Sharif, Lujo Bauer, Michael K. Reiter
This paper proposes a new defense called $n$-ML against adversarial examples, i.e., inputs crafted by perturbing benign inputs by small amounts to induce misclassifications by classifiers. Inspired by $n$-version programming, $n$-ML trains an ensemble of $n$ classifiers, and inputs are classified by a vote of the classifiers in the ensemble. Unlike prior such approaches, however, the classifiers in the ensemble are trained specifically to classify adversarial examples differently, rendering it… 
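The voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ensemble_vote`, the optional `min_votes` rejection threshold, and the toy classifiers are all hypothetical, and the sketch covers only the vote, not how the ensemble members are trained.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence


def ensemble_vote(
    classifiers: Sequence[Callable[[object], Hashable]],
    x: object,
    min_votes: Optional[int] = None,
) -> Optional[Hashable]:
    """Classify x by a plurality vote over the ensemble (illustrative only).

    Each classifier is assumed to be a callable that maps an input to a
    class label. If min_votes is given (a hypothetical parameter) and the
    winning label receives fewer votes than that, None is returned to
    signal that the ensemble declines to classify the input.
    """
    if not classifiers:
        raise ValueError("the ensemble must contain at least one classifier")

    # Count how many classifiers predict each label.
    votes = Counter(clf(x) for clf in classifiers)

    # most_common(1) returns the label with the highest count; among tied
    # labels, the one that was encountered first is returned.
    label, count = votes.most_common(1)[0]

    if min_votes is not None and count < min_votes:
        return None
    return label


# Toy ensemble of three stand-in "classifiers" that ignore their input.
toy_ensemble = [lambda x: "cat", lambda x: "cat", lambda x: "dog"]
majority_label = ensemble_vote(toy_ensemble, x=None)
strict_label = ensemble_vote(toy_ensemble, x=None, min_votes=3)
```

In this toy run two of the three members agree, so the plain vote returns their label, while requiring all three votes causes the input to be rejected.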

Certifying Joint Adversarial Robustness for Model Ensembles
The robustness of various model ensembles, including ensembles whose models are trained to be diverse using cost-sensitive robustness, is evaluated to improve understanding of the potential effectiveness of ensemble models as a defense against adversarial examples.
Ensemble-based Adversarial Defense Using Diversified Distance Mapping
  • Computer Science
  • 2020
It is demonstrated that the ensembles based on DMLs can achieve high benign accuracy while exhibiting robustness against adversarial attacks using multiple white-box techniques along with AutoAttack.
Adaptive Noise Injection for Training Stochastic Student Networks from Deterministic Teachers
This work presents a conceptually clear adaptive noise injection mechanism in combination with teacher-initialisation, which adjusts its degree of randomness dynamically through the computation of mini-batch statistics, embedded within a simple framework to obtain stochastic networks from existing deterministic networks.
Defense Through Diverse Directions
By encouraging the network to distribute evenly across inputs, the network becomes less susceptible to localized, brittle features, which imparts a natural robustness to targeted perturbations.


The Limitations of Deep Learning in Adversarial Settings
This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers
It is demonstrated through extensive experimentation that this method consistently outperforms all existing provably $\ell_2$-robust classifiers by a significant margin on ImageNet and CIFAR-10, establishing the state-of-the-art for provable $\ell_2$-defenses.
On Detecting Adversarial Perturbations
It is shown empirically that adversarial perturbations can be detected surprisingly well even though they are quasi-imperceptible to humans.
Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models
The proposed Defense-GAN, a new framework leveraging the expressive capability of generative models to defend deep neural networks against adversarial perturbations, is empirically shown to be consistently effective against different attack methods and improves on existing defense strategies.
On the Suitability of Lp-Norms for Creating and Preventing Adversarial Examples
It is demonstrated that nearness of inputs as measured by Lp-norms is neither necessary nor sufficient for perceptual similarity, which has implications for both creating and defending against adversarial examples.
Towards Robust Detection of Adversarial Examples
This paper presents a novel training procedure and a thresholding test strategy for robust detection of adversarial examples, and proposes to minimize the reverse cross-entropy (RCE), which encourages a deep network to learn latent representations that better distinguish adversarial examples from normal ones.
Improving Transferability of Adversarial Examples With Input Diversity
This work proposes to improve the transferability of adversarial examples by creating diverse input patterns, applying random transformations to the input images at each iteration, and shows that the proposed attack method can generate adversarial examples that transfer much better to different networks than existing baselines.
On the (Statistical) Detection of Adversarial Examples
It is shown that statistical properties of adversarial examples are essential to their detection: they are not drawn from the same distribution as the original data, and can thus be detected using statistical tests.
Explaining and Harnessing Adversarial Examples
It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.
Detecting Adversarial Samples from Artifacts
This paper investigates model confidence on adversarial samples by looking at Bayesian uncertainty estimates, available in dropout neural networks, and by performing density estimation in the subspace of deep features learned by the model, and results show a method for implicit adversarial detection that is oblivious to the attack algorithm.