Corpus ID: 195776342

Accurate, reliable and fast robustness evaluation

@article{Brendel2019AccurateRA,
  title={Accurate, reliable and fast robustness evaluation},
  author={Wieland Brendel and Jonas Rauber and Matthias K{\"u}mmerer and Ivan Ustyuzhaninov and Matthias Bethge},
  journal={ArXiv},
  year={2019},
  volume={abs/1907.01003}
}
Throughout the past five years, the susceptibility of neural networks to minimal adversarial perturbations has moved from a peculiar phenomenon to a core issue in Deep Learning. Despite much attention, however, progress towards more robust models is significantly impaired by the difficulty of evaluating the robustness of neural network models. Today's methods are either fast but brittle (gradient-based attacks), or they are fairly reliable but slow (score- and decision-based attacks). We here… 

Citations

Sampled Nonlocal Gradients for Stronger Adversarial Attacks
TLDR
Inspired by gradient sampling techniques from non-convex optimization, this work proposes to calculate the gradient direction of the adversarial attack as a weighted average over multiple points in the local vicinity, which increases attack success rates and thus leads to more accurate robustness estimates.
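The averaging idea can be sketched in a few lines of PyTorch; the toy linear model, uniform weights, sample count, and noise scale below are illustrative assumptions rather than the paper's settings.

```python
# Sketch of a sampled "nonlocal" gradient: average the loss gradient over
# several randomly perturbed copies of the input instead of using a single
# point estimate. Toy model and hyperparameters are assumptions.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(32, 10)          # stand-in for a trained classifier
x = torch.rand(1, 32)                    # input in [0, 1]
y = torch.tensor([3])                    # true label

def sampled_gradient(x, y, n_samples=16, sigma=0.05):
    """Average the input gradient over points sampled around x."""
    grads = []
    for _ in range(n_samples):
        x_s = (x + sigma * torch.randn_like(x)).clamp(0, 1).requires_grad_(True)
        loss = F.cross_entropy(model(x_s), y)
        grads.append(torch.autograd.grad(loss, x_s)[0])
    return torch.stack(grads).mean(dim=0)   # uniform weights for simplicity

g = sampled_gradient(x, y)
print(g.shape)  # torch.Size([1, 32]); used as the attack direction
```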
Pruning in the Face of Adversaries
TLDR
This work evaluates the robustness of pruned models against L0, L2 and L∞ attacks for a wide range of attack strengths, several architectures, data sets, pruning methods, and compression rates, and confirms that neural network pruning and adversarial robustness are not mutually exclusive.
Dynamically Sampled Nonlocal Gradients for Stronger Adversarial Attacks
TLDR
Dynamically Sampled Nonlocal Gradient Descent (DSNGD) calculates the gradient direction of the adversarial attack as a weighted average over past gradients of the optimization history, and it is shown empirically that incorporating this nonlocal gradient information yields a more accurate estimate of the global descent direction on noisy, non-convex loss surfaces.
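A minimal sketch of the history-averaging idea, assuming a toy model, a geometric decay over past gradients, and a signed ascent step; the actual DSNGD weighting and schedule may differ.

```python
# Sketch of averaging over the optimization history: keep past input gradients
# and combine them with decaying weights before taking an attack step.
# The decay schedule and step size are illustrative assumptions.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(32, 10)
x = torch.rand(1, 32)
y = torch.tensor([3])

history, decay, alpha = [], 0.8, 0.01
x_adv = x.clone()
for step in range(10):
    x_adv = x_adv.detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    history.append(torch.autograd.grad(loss, x_adv)[0])
    # weight recent gradients more strongly (geometric decay over the history)
    weights = torch.tensor([decay ** (len(history) - 1 - i) for i in range(len(history))])
    weights = weights / weights.sum()
    g = sum(w * h for w, h in zip(weights, history))
    x_adv = (x_adv + alpha * g.sign()).clamp(0, 1)   # untargeted ascent step
print((x_adv - x).abs().max().item())
```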
Improving Adversarial Robustness Through Progressive Hardening
TLDR
Adversarial Training with Early Stopping (ATES) stabilizes network training even for a large perturbation norm and allows the network to operate on a better clean-accuracy versus robustness trade-off curve compared to standard adversarial training (AT).
Improved Gradient based Adversarial Attacks for Quantized Networks
TLDR
This work systematically studies the robustness of quantized networks against gradient-based adversarial attacks and demonstrates that these quantized models suffer from gradient vanishing issues and give a false sense of security.
Selection of Source Images Heavily Influences the Effectiveness of Adversarial Attacks
TLDR
A large-scale model-to-model transferability scenario is devised in which the properties of adversarial examples, generated from every suitable source image in ImageNet using two of the most frequently deployed attacks, are analyzed.
How to Compare Adversarial Robustness of Classifiers from a Global Perspective
TLDR
It is shown that point-wise measures fail to capture important global properties that are essential to reliably compare the robustness of different classifiers, and new ways in which robustness curves can be used to systematically uncover these properties are introduced.
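The underlying notion of a robustness curve, accuracy as a function of the allowed perturbation budget, can be illustrated as follows; the FGSM attack, toy model, and epsilon grid are assumptions, and a serious evaluation would use a stronger attack at each budget.

```python
# Sketch of a robustness curve: accuracy as a function of the allowed
# perturbation size epsilon, estimated here with a simple FGSM attack on a
# toy model.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(32, 10)
x = torch.rand(128, 32)
y = torch.randint(0, 10, (128,))

def fgsm(x, y, eps):
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + eps * grad.sign()).clamp(0, 1).detach()

for eps in [0.0, 0.01, 0.02, 0.05, 0.1]:
    acc = (model(fgsm(x, y, eps)).argmax(dim=1) == y).float().mean().item()
    print(f"eps={eps:.2f}  robust accuracy={acc:.2f}")
```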
Learning to Generate Noise for Multi-Attack Robustness
TLDR
A novel meta-learning framework is proposed that explicitly learns to generate noise to improve the model's robustness against multiple types of attacks, and the robustness of models trained with this scheme is validated on various datasets and against a wide variety of perturbations.
On Global-view Based Defense via Adversarial Attack and Defense Risk Guaranteed Bounds
TLDR
A novel class of f-divergences suitable for measuring divergence among multiple distributions is developed that provides a better understanding of the behavior of attacker and defender, thereby having important implications for both the attack and the defense side.
Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints
TLDR
A fast minimum-norm (FMN) attack is proposed that works with different Lp-norm perturbation models, is robust to hyperparameter choices, does not require an adversarial starting point, and converges within a few lightweight steps.
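A heavily simplified sketch of the adaptive-radius idea behind minimum-norm attacks follows, assuming an L2 threat model and a toy classifier; it is not the exact FMN algorithm, whose step-size and radius schedules differ.

```python
# Simplified adaptive-norm sketch: take gradient steps on the loss, project
# onto an epsilon-ball around the clean input, and shrink or grow epsilon
# depending on whether the current point is adversarial. Constants are
# illustrative assumptions.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(32, 10)
x = torch.rand(1, 32)
y = torch.tensor([3])

eps, alpha, gamma = 1.0, 0.05, 0.05       # L2 radius, step size, radius update
x_adv = x.clone()
for _ in range(100):
    x_adv = x_adv.detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    g = torch.autograd.grad(loss, x_adv)[0]
    x_adv = x_adv + alpha * g / (g.norm() + 1e-12)       # normalized L2 step
    delta = x_adv - x
    if delta.norm() > eps:                               # project onto the ball
        delta = delta * (eps / delta.norm())
    x_adv = (x + delta).clamp(0, 1)
    is_adv = model(x_adv).argmax(dim=1) != y
    eps = eps * (1 - gamma) if is_adv.item() else eps * (1 + gamma)
print("final L2 distance:", (x_adv - x).norm().item())
```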
...
...

References

Showing 1-10 of 26 references
Towards Evaluating the Robustness of Neural Networks
TLDR
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that are successful on both distilled and undistilled neural networks with 100% probability.
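The three attacks introduced here are the Carlini-Wagner (C&W) attacks; a minimal sketch of the L2 variant's objective (perturbation norm plus a weighted logit-margin term) is given below, omitting the tanh reparametrization and the binary search over the trade-off constant c. The toy model, c, and kappa are assumptions.

```python
# Sketch of a Carlini-Wagner style L2 objective: minimize the perturbation
# norm plus c times a logit-margin loss that becomes non-positive once the
# input is misclassified.
import torch

torch.manual_seed(0)
model = torch.nn.Linear(32, 10)
x = torch.rand(1, 32)
y = torch.tensor([3])
c, kappa = 1.0, 0.0

delta = torch.zeros_like(x, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.01)
for _ in range(200):
    logits = model((x + delta).clamp(0, 1))
    true_logit = logits[0, y.item()]
    other = logits[0, torch.arange(10) != y].max()
    margin = torch.clamp(true_logit - other, min=-kappa)  # <= 0 once misclassified
    loss = delta.pow(2).sum() + c * margin
    opt.zero_grad()
    loss.backward()
    opt.step()
print("L2 norm:", delta.detach().norm().item())
```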
Towards the first adversarially robust neural network model on MNIST
TLDR
A novel robust classification model that performs analysis by synthesis using learned class-conditional data distributions is presented and it is demonstrated that most adversarial examples are strongly perturbed towards the perceptual boundary between the original and the adversarial class.
MixTrain: Scalable Training of Formally Robust Neural Networks
TLDR
Stochastic robust approximation and dynamic mixed training are proposed to drastically improve the efficiency of verifiably robust training without sacrificing verified robustness, and MixTrain can achieve up to 95.2% verified robust accuracy against norm-bounded attackers.
Towards Deep Learning Models Resistant to Adversarial Attacks
TLDR
This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
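The attack underlying this robust-optimization view is projected gradient descent (PGD); a minimal L-infinity sketch on a toy model follows, with step size, budget, and iteration count chosen for illustration only.

```python
# Sketch of PGD under an L-infinity constraint: repeated signed gradient steps
# followed by projection back into the epsilon-ball and the valid input range.
# Adversarial training would feed x_adv back into the training loss.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(32, 10)
x = torch.rand(1, 32)
y = torch.tensor([3])
eps, alpha, steps = 0.1, 0.02, 20

x_adv = (x + eps * (2 * torch.rand_like(x) - 1)).clamp(0, 1)  # random start
for _ in range(steps):
    x_adv = x_adv.detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    g = torch.autograd.grad(loss, x_adv)[0]
    x_adv = x_adv + alpha * g.sign()
    x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)  # project
print("L-inf distance:", (x_adv - x).abs().max().item())
```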
Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses
TLDR
An efficient approach is proposed to generate gradient-based attacks that induce misclassifications with low L2 norm, by decoupling the direction and the norm of the adversarial perturbation that is added to the image.
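A sketch of the decoupling idea, assuming a toy model and illustrative constants: the gradient only updates the perturbation's direction, while its norm is set explicitly and shrunk or grown depending on whether the current point is adversarial. This is a simplification of the published DDN schedule.

```python
# Decoupled direction/norm sketch: normalized gradient steps set the direction,
# and the L2 norm is re-imposed each iteration with a multiplicatively
# adapted radius epsilon.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(32, 10)
x = torch.rand(1, 32)
y = torch.tensor([3])

eps, alpha, gamma = 0.5, 0.1, 0.05
delta = torch.zeros_like(x)
for _ in range(100):
    delta = delta.detach().requires_grad_(True)
    loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
    g = torch.autograd.grad(loss, delta)[0]
    delta = delta + alpha * g / (g.norm() + 1e-12)       # update the direction
    is_adv = model((x + delta).clamp(0, 1)).argmax(1) != y
    eps = eps * (1 - gamma) if is_adv.item() else eps * (1 + gamma)
    delta = delta * (eps / (delta.norm() + 1e-12))        # impose the norm
print("final L2 norm:", delta.norm().item())
```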
Boosting Adversarial Attacks with Momentum
TLDR
A broad class of momentum-based iterative algorithms is proposed to boost adversarial attacks by integrating a momentum term into the iterative attack process, which stabilizes update directions and helps escape poor local maxima during the iterations, resulting in more transferable adversarial examples.
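A minimal sketch of such a momentum-based iterative step, in the spirit of MI-FGSM; the decay factor, budget, and toy model are assumptions.

```python
# Momentum-based iterative attack sketch: accumulate an L1-normalized gradient
# into a velocity term and step along its sign, staying inside the
# L-infinity ball around the clean input.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(32, 10)
x = torch.rand(1, 32)
y = torch.tensor([3])
eps, steps, mu = 0.1, 10, 1.0
alpha = eps / steps

g_acc = torch.zeros_like(x)
x_adv = x.clone()
for _ in range(steps):
    x_adv = x_adv.detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    g_acc = mu * g_acc + grad / (grad.abs().sum() + 1e-12)  # momentum update
    x_adv = x_adv + alpha * g_acc.sign()
    x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
print((x_adv - x).abs().max().item())
```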
Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models
TLDR
The Boundary Attack is introduced, a decision-based attack that starts from a large adversarial perturbation and then seeks to reduce the perturbation while staying adversarial; it is competitive with the best gradient-based attacks on standard computer vision tasks like ImageNet.
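A Boundary-Attack-style random walk can be sketched as follows: from a point that is already adversarial, propose a small orthogonal perturbation plus a contraction toward the original, and accept it only if the decision stays wrong. The fixed step sizes and the assumed adversarial starting point are simplifications; the real attack adapts both dynamically.

```python
# Decision-based proposal loop in the spirit of the Boundary Attack:
# only the model's predicted label is used, never its gradients.
import torch

torch.manual_seed(0)
model = torch.nn.Linear(32, 10)
x = torch.rand(1, 32)                      # original input
y = torch.tensor([3])                      # original label
x_adv = torch.rand(1, 32)                  # assumed adversarial starting point

def is_adversarial(z):
    return (model(z).argmax(dim=1) != y).item()

delta_step, eps_step = 0.1, 0.05
for _ in range(200):
    d = x - x_adv
    noise = torch.randn_like(x_adv)
    noise -= (noise * d).sum() / (d.norm() ** 2 + 1e-12) * d    # orthogonal part
    candidate = x_adv + delta_step * d.norm() * noise / (noise.norm() + 1e-12)
    candidate = candidate + eps_step * (x - candidate)           # step toward x
    candidate = candidate.clamp(0, 1)
    if is_adversarial(candidate):
        x_adv = candidate                                        # accept proposal
print("distance to original:", (x_adv - x).norm().item())
```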
EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples
TLDR
Elastic-net attacks to DNNs (EAD) feature $L_1$-oriented adversarial examples and include the state-of-the-art $L_2$ attack as a special case, suggesting novel insights on leveraging $L_1$ distortion in adversarial machine learning and the security implications of DNNs.
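A sketch of an elastic-net regularized attack objective in this spirit: gradient steps on the smooth part (margin loss plus L2 term) followed by an ISTA-style soft-thresholding step for the L1 term. The constants and the omitted box-constraint handling are assumptions.

```python
# Elastic-net attack objective sketch: c * margin_loss + ||delta||_2^2 + beta * ||delta||_1,
# optimized with proximal (ISTA-style) updates that encourage sparse perturbations.
import torch

torch.manual_seed(0)
model = torch.nn.Linear(32, 10)
x = torch.rand(1, 32)
y = torch.tensor([3])
c, beta, lr = 1.0, 0.01, 0.05

delta = torch.zeros_like(x)
for _ in range(200):
    delta = delta.detach().requires_grad_(True)
    logits = model((x + delta).clamp(0, 1))
    true_logit = logits[0, y.item()]
    other = logits[0, torch.arange(10) != y].max()
    smooth = c * torch.clamp(true_logit - other, min=0) + delta.pow(2).sum()
    g = torch.autograd.grad(smooth, delta)[0]
    z = delta - lr * g
    delta = torch.sign(z) * torch.clamp(z.abs() - lr * beta, min=0)  # soft-threshold
print("L1 norm:", delta.abs().sum().item(), "L2 norm:", delta.norm().item())
```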
Provable defenses against adversarial examples via the convex outer adversarial polytope
TLDR
A method is presented to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations, and it is shown that the dual of the resulting linear program can itself be represented as a deep network similar to the backpropagation network, leading to very efficient optimization approaches that produce guaranteed bounds on the robust loss.
The Limitations of Deep Learning in Adversarial Settings
TLDR
This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
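A simplified Jacobian-based saliency computation in the spirit of JSMA is sketched below; single-feature updates, the fixed perturbation theta, and the toy model are assumptions.

```python
# Jacobian-based saliency sketch: score each input feature by how much it
# increases the target logit while decreasing the others, then perturb the
# highest-scoring feature.
import torch

torch.manual_seed(0)
model = torch.nn.Linear(32, 10)
x = torch.rand(1, 32)
target = 5
theta = 0.2                                  # perturbation added per step

x_adv = x.clone()
for _ in range(10):
    x_in = x_adv.detach().requires_grad_(True)
    logits = model(x_in)[0]
    jac = torch.stack([torch.autograd.grad(logits[k], x_in, retain_graph=True)[0][0]
                       for k in range(10)])          # shape (10, 32)
    d_target = jac[target]
    d_others = jac.sum(dim=0) - d_target
    saliency = torch.where((d_target > 0) & (d_others < 0),
                           d_target * d_others.abs(),
                           torch.zeros_like(d_target))
    i = saliency.argmax()
    x_adv = x_adv.clone()
    x_adv[0, i] = (x_adv[0, i] + theta).clamp(0, 1)
    if model(x_adv).argmax(dim=1).item() == target:
        break
print("predicted class:", model(x_adv).argmax(dim=1).item())
```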
...
...