Corpus ID: 244773413

$\ell_\infty$-Robustness and Beyond: Unleashing Efficient Adversarial Training

@inproceedings{Dolatabadi2021ell_inftyRobustnessAB,
  title={$\ell_\infty$-Robustness and Beyond: Unleashing Efficient Adversarial Training},
  author={Hadi M. Dolatabadi and Sarah Monazam Erfani and Christopher Leckie},
  year={2021}
}
Abstract: Neural networks are vulnerable to adversarial attacks: adding well-crafted, imperceptible perturbations to their input can change their output. Adversarial training is one of the most effective approaches for training models that are robust to such attacks. However, it is much slower than vanilla training of neural networks, since it must construct adversarial examples for the entire training set at every iteration, hampering its effectiveness. Recently, Fast Adversarial Training…
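The training loop the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a PyTorch image classifier `model`, inputs in $[0,1]$, an $\ell_\infty$ budget `eps`, and a PGD inner loop with step size `alpha` and `steps` iterations (all hypothetical names).

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps, alpha, steps):
    """Craft l_inf-bounded adversarial examples with PGD (illustrative sketch)."""
    # Start from a random point inside the eps-ball around x.
    delta = torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        # Ascend the loss, then project back onto the eps-ball and valid pixel range.
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach()
        delta = (x + delta).clamp(0, 1) - x
    return (x + delta).detach()

def adversarial_training_epoch(model, loader, optimizer, eps=8/255, alpha=2/255, steps=10):
    """One epoch of adversarial training: attack every batch, then train on it."""
    model.train()
    for x, y in loader:
        x_adv = pgd_attack(model, x, y, eps, alpha, steps)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
```

The `steps` extra forward/backward passes per batch are what makes adversarial training so much slower than vanilla training; fast adversarial training methods replace this inner loop with a single step, as in the first reference below.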


References

Showing 1-10 of 45 references

Fast is better than free: Revisiting adversarial training

TLDR
This work makes the surprising discovery that empirically robust models can be trained using a much weaker and cheaper adversary, an approach previously believed to be ineffective, rendering the method no more costly than standard training in practice.
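A hedged sketch of the "much weaker and cheaper adversary" idea, i.e. a single FGSM step from a random starting point; `model`, `eps`, and `alpha` are illustrative placeholders rather than identifiers from the cited paper, and inputs are assumed to lie in $[0,1]$.

```python
import torch
import torch.nn.functional as F

def fgsm_random_start(model, x, y, eps=8/255, alpha=10/255):
    """Single-step l_inf attack with a random start (illustrative sketch)."""
    # Random initialization inside the eps-ball, then one signed-gradient step.
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    grad, = torch.autograd.grad(loss, delta)
    delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
    return (x + delta).clamp(0, 1).detach()
```

Training on such examples adds roughly one forward/backward pass per batch, which is why the resulting method is close to standard training in cost.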

Ensemble Adversarial Training: Attacks and Defenses

TLDR
This work finds that adversarially trained models remain vulnerable to black-box attacks in which perturbations computed on undefended models are transferred, and it introduces a powerful new single-step attack that escapes the non-smooth vicinity of the input data via a small random step.

Towards Deep Learning Models Resistant to Adversarial Attacks

TLDR
This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
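The robust-optimization view referenced in this summary is usually written as a saddle-point problem; a sketch (with $\theta$ the model parameters, $\mathcal{L}$ the training loss, $\mathcal{D}$ the data distribution, and an $\ell_\infty$ perturbation budget $\epsilon$):

$$\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}} \Big[ \max_{\|\delta\|_\infty \le \epsilon} \mathcal{L}\big(f_\theta(x+\delta),\, y\big) \Big]$$

Adversarial training approximates the inner maximization with an attack such as PGD, and fast variants replace it with a single gradient step.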

Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

TLDR
Two extensions of the PGD attack that overcome failures due to suboptimal step sizes and problems with the objective function are proposed and combined with two complementary existing attacks to form a parameter-free, computationally affordable, and user-independent ensemble of attacks for testing adversarial robustness.

Functional Adversarial Attacks

TLDR
It is shown that functional threat models can be combined with existing additive ($\ell_p$) threat models to generate stronger threat models that allow both small, individual perturbations and large, uniform changes to an input.

Testing Robustness Against Unforeseen Adversaries

TLDR
This work introduces a total of four novel adversarial attacks to create ImageNet-UA's diverse attack suite, and demonstrates that, in comparison to ImageNet-UA, prevailing $\ell_\infty$ robustness assessments give a narrow account of model robustness.

Spatially Transformed Adversarial Examples

TLDR
Perturbations generated through spatial transformation can result in large $\ell_p$ distances, but extensive experiments show that such spatially transformed adversarial examples are perceptually realistic and more difficult to defend against with existing defense systems.

Perceptual Adversarial Robustness: Defense Against Unseen Threat Models

TLDR
Perceptual Adversarial Training against a perceptual attack gives robustness against many other types of adversarial attacks, and is the first adversarial defense with this property.

Explaining and Harnessing Adversarial Examples

TLDR
It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.
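The linearity argument summarized here motivates the fast gradient sign method introduced in that paper; in the usual notation (with $J(\theta, x, y)$ the training loss), the perturbed input is

$$x_{\mathrm{adv}} = x + \epsilon \cdot \mathrm{sign}\big(\nabla_{x} J(\theta, x, y)\big)$$

This single-step adversary is the building block that the fast adversarial training methods above rely on.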

Robust Physical-World Attacks on Deep Learning Visual Classification

TLDR
This work proposes a general attack algorithm, Robust Physical Perturbations (RP2), to generate robust visual adversarial perturbations under different physical conditions, and shows that adversarial examples generated using RP2 achieve high targeted misclassification rates against standard-architecture road sign classifiers in the physical world under various environmental conditions, including different viewpoints.