• Corpus ID: 224705419

RobustBench: a standardized adversarial robustness benchmark

  title={RobustBench: a standardized adversarial robustness benchmark},
  author={Francesco Croce and Maksym Andriushchenko and Vikash Sehwag and Edoardo Debenedetti and Mung Chiang and Prateek Mittal and Matthias Hein},
Evaluation of adversarial robustness is often error-prone, leading to overestimation of the true robustness of models. While adaptive attacks designed for a particular defense are a way out of this, there are only approximate guidelines on how to perform them. Moreover, adaptive evaluations are highly customized for particular models, which makes it difficult to compare different defenses. Our goal is to establish a standardized benchmark of adversarial robustness, which as accurately as… 

Multiple Perturbation Attack: Attack Pixelwise Under Different $\ell_p$-norms For Better Adversarial Performance

This work learns how to perturb each pixel under a different $\ell_p$-norm to maximize attack performance while maintaining the overall visual imperceptibility of adversarial examples, and shows that the method outperforms most current strong attacks across state-of-the-art defense mechanisms while keeping the adversarial examples visually clean.

Reliable Robustness Evaluation via Automatically Constructed Attack Ensembles

This work presents AutoAE, a conceptually simple approach for automatically constructing attack ensembles (AEs), and shows that it serves as a reliable evaluation protocol for adversarial robustness, further indicating the large potential of automatic AE construction.

A Unified Wasserstein Distributional Robustness Framework for Adversarial Training

A unified framework that connects Wasserstein distributional robustness with current state-of-the-art AT methods and demonstrates that standard AT methods are special cases of their counterparts in this framework, and leads to an intuitive relaxation and generalization of existing AT methods.

Is RobustBench/AutoAttack a suitable Benchmark for Adversarial Robustness?

This paper argues that the alteration of data by AutoAttack with $\ell_\infty$, $\epsilon = 8/255$ is unrealistically strong, and that results on low-resolution datasets like CIFAR10 do not generalize well to higher-resolution images, as gradient-based attacks appear to become even more detectable with increasing resolution.

Imbalanced Gradients: A Subtle Cause of Overestimated Adversarial Robustness

This paper formulates a Margin Decomposition (MD) attack that decomposes a margin loss into individual terms and then explores the attackability of these terms separately via a two-stage process, and proposes multi-targeted and ensemble versions of the MD attack.
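The idea of splitting a margin loss into per-class terms can be sketched in a few lines; `margin_terms` is a hypothetical helper for illustration, not the paper's actual MD attack implementation:

```python
import numpy as np

def margin_terms(logits, y):
    """Decompose the margin loss z_y - max_{j != y} z_j into the
    individual differences z_y - z_j for every wrong class j.
    Illustrative sketch only."""
    z_y = logits[y]
    others = np.delete(logits, y)
    terms = z_y - others          # one margin term per wrong class
    margin = terms.min()          # overall margin loss
    return margin, terms

# Example: three classes, true label 0
margin, terms = margin_terms(np.array([2.0, 1.5, 0.5]), 0)
# margin = 0.5, terms = [0.5, 1.5]
```

An attack can then target the individual terms (here, each entry of `terms`) rather than only the overall minimum, which is the intuition behind the two-stage process described above.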

A Light Recipe to Train Robust Vision Transformers

This paper shows that ViTs are highly suitable for adversarial training and can achieve competitive performance, and recommends that the community avoid directly translating canonical ViT training recipes to robust training and instead rethink common training choices in the context of adversarial training.

Supplementary Material for Relating Adversarially Robust Generalization to Flat Minima

It is found that even simple regularization schemes, e.g., AutoAugment, weight decay or label noise, also improve robustness by finding flatter minima, and approaches known to improve flatness also improve adversarial robustness.

ImageNet-Patch: A Dataset for Benchmarking Machine Learning Robustness against Adversarial Patches

This work proposes ImageNet-Patch, a dataset for benchmarking machine-learning models against adversarial patches: a set of patches optimized to generalize across different models and applied to ImageNet data after preprocessing with affine transformations, enabling an approximate yet faster robustness evaluation.

Understanding the Impact of Adversarial Robustness on Accuracy Disparity

It is demonstrated that while the constraint of adversarial robustness consistently degrades the standard accuracy in the balanced class setting, the class imbalance ratio plays a fundamentally different role in accuracy disparity compared to the Gaussian case, due to the heavy tail of the stable distribution.

DISCO: Adversarial Defense with Local Implicit Functions

A novel adversarial defense for image classification, the local implicit module DISCO, is proposed to remove adversarial perturbations via localized manifold projections; it is shown to be data- and parameter-efficient and to mount defenses that transfer across datasets, classifiers, and attacks.

ImageNet: A large-scale hierarchical image database

A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

Learning Multiple Layers of Features from Tiny Images

It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.

Benchmarking Neural Network Robustness to Common Corruptions and Perturbations

This paper standardizes and expands the corruption robustness topic, while showing which classifiers are preferable in safety-critical applications, and proposes a new dataset called ImageNet-P which enables researchers to benchmark a classifier's robustness to common perturbations.

Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

This work identifies obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples, and develops attack techniques to overcome this effect.

Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

Two extensions of the PGD attack that overcome failures due to suboptimal step sizes and problems of the objective function are proposed and combined with two complementary existing attacks to form a parameter-free, computationally affordable, and user-independent ensemble of attacks for testing adversarial robustness.
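The worst-case aggregation behind such an attack ensemble can be sketched as follows; the function name and inputs are illustrative assumptions, not the library's actual API:

```python
import numpy as np

def ensemble_robust_flags(attack_results):
    """Given per-attack boolean arrays (True = example survived that
    attack), an example counts as robust only if it survives EVERY
    attack: the ensemble reports the per-example worst case.
    Illustrative of the aggregation, not of any specific attack."""
    stacked = np.stack(attack_results)     # shape: (n_attacks, n_examples)
    return stacked.all(axis=0)

# Two hypothetical attacks evaluated on four examples
a1 = np.array([True, True, False, True])
a2 = np.array([True, False, False, True])
robust = ensemble_robust_flags([a1, a2])
robust_accuracy = robust.mean()            # 0.5
```

This worst-case-over-attacks reduction is what makes an ensemble stricter than any single attack: an example only counts toward robust accuracy if no attack in the ensemble fools the model on it.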

Fixing Data Augmentation to Improve Adversarial Robustness

It is demonstrated that, contrary to previous findings, when combined with model weight averaging, data augmentation can significantly boost robust accuracy and state-of-the-art generative models can be leveraged to artificially increase the size of the training set and improve adversarial robustness.
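Model weight averaging of the kind mentioned here is often realized as an exponential moving average over parameters. A minimal sketch with plain numpy arrays (the helper name and decay value are illustrative assumptions):

```python
import numpy as np

def ema_update(avg_params, params, decay=0.999):
    """One step of exponential moving-average weight averaging, as
    commonly paired with data augmentation in robust training.
    Hypothetical helper: parameters are plain numpy arrays here."""
    return [decay * a + (1.0 - decay) * p
            for a, p in zip(avg_params, params)]

avg = [np.zeros(3)]
for _ in range(5):                     # pretend 5 training steps
    avg = ema_update(avg, [np.ones(3)], decay=0.5)
# after 5 steps with decay 0.5: 1 - 0.5**5 = 0.96875
```

The averaged copy of the weights is what gets evaluated; the smoothing suppresses step-to-step noise, which is one way flatter solutions are reached in practice.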

A critique of the DeepSec Platform for Security Analysis of Deep Learning Models

At IEEE S&P, attacks and defenses were evaluated using incorrect implementations that greatly underestimate their effectiveness, and the robustness of each defense was evaluated as an average rather than against the most effective attack on that defense.

Theoretically Principled Trade-off between Robustness and Accuracy

The prediction error for adversarial examples (robust error) is decomposed as the sum of the natural (classification) error and the boundary error, and a differentiable upper bound is provided using the theory of classification-calibrated losses; this bound is shown to be the tightest possible upper bound uniform over all probability distributions and measurable predictors.
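A sketch of that decomposition in standard notation, assuming perturbations within an $\epsilon$-ball:

```latex
\mathcal{R}_{\mathrm{rob}}(f) \;=\; \mathcal{R}_{\mathrm{nat}}(f) \;+\; \mathcal{R}_{\mathrm{bdy}}(f)
```

where $\mathcal{R}_{\mathrm{nat}}(f)$ is the natural classification error and $\mathcal{R}_{\mathrm{bdy}}(f)$ is the probability that a correctly classified input lies within distance $\epsilon$ of the decision boundary of $f$.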

Automatic differentiation in PyTorch

An automatic differentiation module of PyTorch is described: a library designed to enable rapid research on machine learning models, focusing on differentiation of purely imperative programs, with an emphasis on extensibility and low overhead.

Towards Deep Learning Models Resistant to Adversarial Attacks

This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
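The first-order adversary referred to here is typically instantiated as projected gradient descent (PGD). A minimal numpy sketch under an $\ell_\infty$ budget, with `grad_fn` standing in for a framework's autodiff (function name and defaults are illustrative):

```python
import numpy as np

def pgd_linf(x, grad_fn, eps=8/255, alpha=2/255, steps=10):
    """Projected gradient descent under an l_inf budget (sketch).
    grad_fn returns the gradient of the loss w.r.t. the input;
    in practice this comes from a framework's autodiff."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))   # ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)          # project onto l_inf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)                  # stay in valid pixel range
    return x_adv

# Toy loss: maximize sum(x), so the gradient is all ones
x = np.full(4, 0.5)
x_adv = pgd_linf(x, lambda z: np.ones_like(z))
# each coordinate is driven to the edge of the budget: 0.5 + 8/255
```

Solving this inner maximization and training on the resulting examples is the robust-optimization view of adversarial training described above.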