• Corpus ID: 195791557

@inproceedings{Croce2020MinimallyDA,
  title={Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack},
  author={Francesco Croce and Matthias Hein},
  booktitle={ICML},
  year={2020}
}
• Published in ICML 3 July 2019
• Computer Science
The evaluation of robustness against adversarial manipulation of neural network-based classifiers is mainly tested with empirical attacks, as methods for the exact computation, even when available, do not scale to large networks. We propose in this paper a new white-box adversarial attack w.r.t. the $l_p$-norms for $p \in \{1,2,\infty\}$ aiming at finding the minimal perturbation necessary to change the class of a given input. It has an intuitive geometric meaning, yields quickly high-quality…
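For a linear classifier the minimal-perturbation objective has a closed form: the optimal $l_2$ perturbation is the orthogonal projection of the input onto the decision hyperplane. This is the geometric picture such attacks generalize by repeatedly linearizing the network. A toy sketch (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def minimal_l2_perturbation(w, b, x):
    """Closed-form minimal l2 perturbation for the linear classifier
    sign(w @ x + b): the orthogonal projection of x onto the decision
    hyperplane w @ x + b = 0, with norm |w @ x + b| / ||w||."""
    margin = w @ x + b
    return -margin * w / (w @ w)

w, b = np.array([1.0, -2.0]), 0.5
x = np.array([2.0, 0.0])            # margin w @ x + b = 2.5, class +1

delta = minimal_l2_perturbation(w, b, x)
x_adv = x + 1.001 * delta           # tiny overshoot to cross the boundary

print(np.linalg.norm(delta))        # distance to the boundary, 2.5/sqrt(5)
print(np.sign(w @ x + b), np.sign(w @ x_adv + b))   # class flips: 1.0 -1.0
```

Roughly, the nonlinear case replaces the fixed hyperplane with a local linearization of the network at each iterate.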
## 158 Citations

Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks
• Computer Science
ICML
• 2020
Two extensions of the PGD-attack overcoming failures due to suboptimal step size and problems of the objective function are proposed and combined with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.
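The step-size failure mode mentioned here can be illustrated with a stripped-down PGD that halves its step periodically and restarts from the best point found. This is a much-simplified stand-in for Auto-PGD's adaptive schedule; the toy objective and all names are assumptions, not the paper's:

```python
import numpy as np

def pgd_linf(loss, grad, x0, eps, n_iter=40):
    """Projected gradient ascent in an l_inf ball with periodic step
    halving, restarting each phase from the best point found so far."""
    step = eps / 2
    x = x0.copy()
    best_x, best_val = x.copy(), loss(x)
    for i in range(1, n_iter + 1):
        # signed-gradient step, then projection onto the l_inf ball
        x = np.clip(x + step * np.sign(grad(x)), x0 - eps, x0 + eps)
        val = loss(x)
        if val > best_val:
            best_val, best_x = val, x.copy()
        if i % 10 == 0:          # crude halving schedule
            step /= 2
            x = best_x.copy()    # restart from the incumbent
    return best_x, best_val

# toy objective: maximize -(x - t)^2 inside the ball around x0
t = np.array([3.0, -3.0])
loss = lambda x: -np.sum((x - t) ** 2)
grad = lambda x: -2 * (x - t)
x0 = np.zeros(2)
x_adv, val = pgd_linf(loss, grad, x0, eps=0.5)
print(x_adv)   # pushed to the ball corner nearest t: [0.5, -0.5]
```

The real Auto-PGD schedule halves the step based on progress checks rather than a fixed period.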
Output Diversified Initialization for Adversarial Attacks
• Computer Science
ArXiv
• 2020
Output Diversified Initialization (ODI), a novel random initialization strategy that can be combined with most existing white-box adversarial attacks, is proposed, which outperforms current state-of-the-art attacks against robust models and becomes much more efficient on several datasets.
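The idea is cheap to sketch: instead of uniform random restarts in the input ball, ODI ascends a random direction in output (logit) space for a few steps, spreading the starting points apart in output space. A minimal sketch with a linear "network" f(x) = A @ x (all names illustrative):

```python
import numpy as np

def odi_start(A, x0, eps, rng, n_steps=2):
    """ODI sketch for a linear 'network' f(x) = A @ x: draw a random
    direction w in logit space, then take signed-gradient steps that
    increase w @ f(x), staying inside the l_inf ball around x0."""
    w = rng.uniform(-1.0, 1.0, size=A.shape[0])
    x = x0.copy()
    for _ in range(n_steps):
        g = A.T @ w                                  # grad_x of w @ (A @ x)
        x = np.clip(x + eps * np.sign(g), x0 - eps, x0 + eps)
    return x

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x0 = np.zeros(2)
rng = np.random.default_rng(0)
starts = [odi_start(A, x0, eps=0.1, rng=rng) for _ in range(3)]
```

Each start is a valid initialization (it stays inside the perturbation ball) but the random logit-space direction varies which corner of the ball it lands near.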
• Computer Science
ArXiv
• 2022
A new adversarial training algorithm is proposed that is theoretically well motivated and empirically superior to other existing algorithms and a novel feature of the proposed algorithm is to use a data-adaptive regularization for robustifying a prediction model.
• Computer Science
AAAI
• 2022
The anti-adversary layer is proposed, which generates an input perturbation in the opposite direction of the adversarial one and feeds the classifier a perturbed version of the input.
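For a linear binary classifier, the layer's effect is easy to see: stepping against the loss gradient grows the margin of the predicted class, so a subsequent adversarial perturbation needs a larger budget. A hedged sketch (the toy gradient and all names are assumptions):

```python
import numpy as np

def anti_adversary(x, grad_loss, alpha=0.1):
    """Perturb the input *against* the loss gradient (the opposite of an
    FGSM step) before handing it to the classifier."""
    return x - alpha * np.sign(grad_loss(x))

# toy linear classifier sign(w @ x + b); for the predicted (positive)
# class, the loss falls as the margin grows, so grad_loss points along -w
w, b = np.array([1.0, -2.0]), 0.5
grad_loss = lambda x: -w if (w @ x + b) > 0 else w
x = np.array([0.2, 0.0])            # weakly classified: margin 0.7
x_def = anti_adversary(x, grad_loss)
print(w @ x_def + b)                # margin grows to ~1.0
```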
Towards Transferable Adversarial Perturbations with Minimum Norm
• Computer Science
• 2021
This work proposes a geometry-aware framework to generate transferable adversarial perturbations with minimum norm for each input and, analogous to model selection in statistical machine learning, leverages a validation model to select the optimal perturbation budget for each image.
• Computer Science
NeurIPS
• 2021
A fast minimum-norm (FMN) attack that works with different p -norm perturbation models, is robust to hyperparameter choices, does not require adversarial starting points, and converges within few lightweight steps is proposed.
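On a linear toy classifier, the alternating scheme (step toward misclassification, adaptively grow or shrink the norm budget, project back onto the budget ball) can be sketched in a few lines. The schedule below is a simplification of the paper's, and every name is mine:

```python
import numpy as np

def fmn_l2_sketch(w, b, x0, n_iter=100, gamma=0.1):
    """FMN-flavored loop for the linear classifier sign(w @ x + b):
    alternate a normalized step toward misclassification with a
    multiplicative grow/shrink of the l2 budget eps, projecting the
    iterate onto the eps-ball around x0; track the best adversarial point."""
    y0 = np.sign(w @ x0 + b)
    eps, x, best = 0.0, x0.copy(), None
    step = -y0 * w / np.linalg.norm(w)      # descent direction of the margin
    for _ in range(n_iter):
        adv = np.sign(w @ x + b) != y0
        if adv and (best is None
                    or np.linalg.norm(x - x0) < np.linalg.norm(best - x0)):
            best = x.copy()
        # shrink the budget when already adversarial, grow it otherwise
        eps = eps * (1 - gamma) if adv else eps * (1 + gamma) + 1e-3
        d = x + gamma * step - x0
        n = np.linalg.norm(d)
        x = x0 + (d if n <= eps else d * eps / n)   # project onto eps-ball
    return best

w, b = np.array([1.0, -2.0]), 0.5
x0 = np.array([2.0, 0.0])               # true minimal l2 distance: 2.5/sqrt(5)
x_adv = fmn_l2_sketch(w, b, x0)
```

The oscillating budget homes in on the minimal-norm boundary crossing without needing an adversarial starting point.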
• Computer Science
ArXiv
• 2020
This paper conducts experiments to understand the behavior of fast adversarial training and shows the key to its success is the ability to recover from overfitting to weak attacks; it extends these findings to improve fast adversarial training, demonstrating superior robust accuracy to strong adversarial training with much-reduced training time.
Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks
• Computer Science
ICML
• 2022
This paper introduces several innovations that make white-box targeted attacks follow the intuition of the attacker’s goal: to trick the model to assign a higher probability to the target class than to any other, while staying within a specified distance from the original input.
Pixle: a fast and effective black-box attack based on rearranging pixels
• Computer Science
ArXiv
• 2022
This paper proposes a novel attack that can be performed without knowing the inner structure of the attacked model, nor the training procedure, and is capable of correctly attacking a high percentage of samples by rearranging a small number of pixels within the attacked image.
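A pixel-rearranging attack is natural to sketch as a greedy black-box search: propose a swap of two pixel values, keep it only if the model's true-class score drops. The real attack maps whole patches of source pixels; this miniature, with a toy scorer and illustrative names, only shows the query-based loop:

```python
import numpy as np

def pixle_like(predict, x, n_iter=200, rng=None):
    """Pixle-flavored black-box sketch: greedily swap pairs of pixel
    values, keeping a swap only if it lowers the true-class score.
    Needs only forward queries, no gradients or model internals."""
    rng = np.random.default_rng() if rng is None else rng
    x = x.copy()
    score = predict(x)
    for _ in range(n_iter):
        i, j = rng.integers(0, x.size, 2)
        x[i], x[j] = x[j], x[i]
        s = predict(x)
        if s < score:
            score = s
        else:
            x[i], x[j] = x[j], x[i]     # revert the unhelpful swap
    return x, score

x = np.linspace(0.0, 1.0, 8)
predict = lambda z: float(3 * z[7] + z.sum())   # toy score, pixel 7 heavy
x_adv, s = pixle_like(predict, x, rng=np.random.default_rng(0))
```

Because the attack only permutes existing values, the image histogram is preserved exactly, which is part of what keeps such perturbations inconspicuous.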
Localized Uncertainty Attacks
• Computer Science, Mathematics
ArXiv
• 2021
The susceptibility of deep learning models to adversarial perturbations has stirred renewed attention in adversarial examples resulting in a number of attacks. However, most of these attacks fail to…

## References

Showing 1–10 of 38 references
Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks
• Computer Science
ICML
• 2020
Two extensions of the PGD-attack overcoming failures due to suboptimal step size and problems of the objective function are proposed and combined with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.
Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation
• Computer Science
NIPS
• 2017
This paper shows for the first time formal guarantees on the robustness of a classifier by giving instance-specific lower bounds on the norm of the input manipulation required to change the classifier decision.
Scaling up the Randomized Gradient-Free Adversarial Attack Reveals Overestimation of Robustness Using Established Attacks
• Computer Science
International Journal of Computer Vision
• 2019
This work significantly improves the randomized gradient-free attack for ReLU networks (Croce and Hein in GCPR, 2018), in particular by scaling it up to large networks, thus revealing an overestimation of the robustness by state-of-the-art methods.
Wasserstein Adversarial Examples via Projected Sinkhorn Iterations
• Computer Science
ICML
• 2019
A new threat model for adversarial attacks based on the Wasserstein distance is proposed, which can successfully attack image classification models, and it is demonstrated that PGD-based adversarial training can improve this adversarial accuracy to 76%.
Adversarial Training and Robustness for Multiple Perturbations
• Computer Science
NeurIPS
• 2019
It is proved that a trade-off in robustness to different types of $\ell_p$-bounded and spatial perturbations must exist in a natural and simple statistical setting, and questioned the viability and computational scalability of extending adversarial robustness, and adversarial training, to multiple perturbation types.
• Computer Science
AAAI
• 2019
Distributionally adversarial attack (DAA) is proposed: a framework that solves for an optimal adversarial-data distribution, i.e., a perturbed distribution that satisfies the L∞ constraint but deviates from the original data distribution so as to maximally increase the generalization risk.
Towards Evaluating the Robustness of Neural Networks
• Computer Science
2017 IEEE Symposium on Security and Privacy (SP)
• 2017
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that are successful on both distilled and undistilled neural networks with 100% probability.
Towards Deep Learning Models Resistant to Adversarial Attacks
• Computer Science
ICLR
• 2018
This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
Simple Black-Box Adversarial Perturbations for Deep Networks
• Computer Science
ArXiv
• 2016
This work focuses on deep convolutional neural networks and demonstrates that adversaries can easily craft adversarial examples even without any internal knowledge of the target network.
One Pixel Attack for Fooling Deep Neural Networks
• Computer Science
IEEE Transactions on Evolutionary Computation
• 2019
This paper proposes a novel method for generating one-pixel adversarial perturbations based on differential evolution (DE), which requires less adversarial information (a black-box attack) and can fool more types of networks due to the inherent features of DE.
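The differential-evolution search over single-pixel candidates can be sketched in miniature. The paper evolves (x, y, RGB) tuples with a full DE setup; the toy below works on a flattened grayscale vector with a made-up scorer, classic rand/1 mutation, and greedy selection (every name is illustrative):

```python
import numpy as np

def one_pixel_attack(predict, x, n_pop=20, n_iter=30, rng=None):
    """One-pixel attack sketch: differential evolution over candidates
    (pixel_index, new_value), minimizing the true-class score via
    forward queries only (black-box)."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    # population rows are (pixel index in [0, d), pixel value in [0, 1])
    pop = np.stack([rng.integers(0, d, n_pop),
                    rng.uniform(0.0, 1.0, n_pop)], axis=1)

    def score(cand):
        x_mod = x.copy()
        x_mod[int(cand[0])] = cand[1]
        return predict(x_mod)              # lower true-class score = better

    fitness = np.array([score(c) for c in pop])
    for _ in range(n_iter):
        for i in range(n_pop):
            a, b, c = pop[rng.choice(n_pop, 3, replace=False)]
            trial = a + 0.5 * (b - c)                 # rand/1 mutation
            trial[0] = int(np.clip(trial[0], 0, d - 1))
            trial[1] = np.clip(trial[1], 0.0, 1.0)
            f = score(trial)
            if f < fitness[i]:                        # greedy selection
                pop[i], fitness[i] = trial, f
    return pop[np.argmin(fitness)]

# toy 'model': the true-class score depends heavily on one pixel
x = np.full(16, 0.5)
predict = lambda z: float(3.0 * z[7] + 0.1 * z.sum())
best = one_pixel_attack(predict, x, rng=np.random.default_rng(1))
```

Since the search needs only the model's output scores, it matches the blurb's point: less adversarial information than gradient-based attacks, at the cost of many queries.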