The Space of Adversarial Strategies
@article{Sheatsley2022TheSO,
  title   = {The Space of Adversarial Strategies},
  author  = {Ryan Sheatsley and Blaine Hoak and Eric Pauley and Patrick McDaniel},
  journal = {ArXiv},
  year    = {2022},
  volume  = {abs/2209.04521}
}
Adversarial examples, inputs designed to induce worst-case behavior in machine learning models, have been extensively studied over the past decade. Yet, our understanding of this phenomenon stems from a rather fragmented pool of knowledge; at present, there are a handful of attacks, each with disparate assumptions in threat models and incomparable definitions of optimality. In this paper, we propose a systematic approach to characterize worst-case (i.e., optimal) adversaries. We first…
References
Showing 1-10 of 58 references
The Space of Transferable Adversarial Examples
- ArXiv, 2017
It is found that adversarial examples span a contiguous subspace of large (~25) dimensionality, which indicates that it may be possible to design defenses against transfer-based attacks, even for models that are vulnerable to direct attacks.
Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks
- ICML, 2020
Two extensions of the PGD attack that overcome failures due to suboptimal step sizes and flaws in the objective function are proposed and combined with two complementary existing attacks to form a parameter-free, computationally affordable, and user-independent ensemble for testing adversarial robustness; a worst-case evaluation sketch follows below.
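As a rough illustration of the worst-case-over-attacks evaluation described in this entry (not the authors' implementation), the sketch below assumes `attacks` is a list of hypothetical callables that each map a batch `(x, y)` to adversarial inputs, e.g. the PGD variants and complementary attacks mentioned above.

```python
import torch

def robust_accuracy(model, x, y, attacks):
    """Worst-case accuracy over an ensemble of attacks: an input counts as
    robust only if it is classified correctly under every attack."""
    robust = torch.ones(len(x), dtype=torch.bool)
    for attack in attacks:                # each attack maps (x, y) -> adversarial x
        x_adv = attack(x, y)
        with torch.no_grad():
            preds = model(x_adv).argmax(dim=1)
        robust &= preds.eq(y)             # a single successful attack breaks robustness
    return robust.float().mean().item()
```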
Towards Deep Learning Models Resistant to Adversarial Attacks
- ICLR, 2018
This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
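Purely as an illustration of the first-order adversary discussed in this entry, here is a minimal sketch of an l_inf projected gradient descent (PGD) attack, assuming a PyTorch classifier and inputs scaled to [0, 1]; the parameter values (`eps`, `alpha`, `steps`) are illustrative defaults, not the configuration used in the paper.

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Illustrative l_inf PGD: ascend the loss, project back into the eps-ball."""
    x_adv = x.clone().detach()
    # random start inside the eps-ball
    x_adv = x_adv + torch.empty_like(x_adv).uniform_(-eps, eps)
    x_adv = torch.clamp(x_adv, 0.0, 1.0)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                      # signed first-order step
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)    # project into the eps-ball
            x_adv = torch.clamp(x_adv, 0.0, 1.0)                     # stay in valid input range
    return x_adv.detach()
```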
On the Robustness of Domain Constraints
- CCS, 2021
This paper develops techniques to learn domain constraints from data, and shows how the learned constraints can be integrated into the adversarial crafting process and evaluates the efficacy of the approach in network intrusion and phishing datasets.
Automated Discovery of Adaptive Attacks on Adversarial Defenses
- NeurIPS, 2021
This work presents an extensible framework that defines a search space over a set of reusable building blocks and automatically discovers an effective attack on a given model with an unknown defense by searching over suitable combinations of these blocks.
Intriguing Properties of Adversarial ML Attacks in the Problem Space
- IEEE Symposium on Security and Privacy (SP), 2020
A novel formalization for adversarial ML evasion attacks in the problem-space is proposed, which includes the definition of a comprehensive set of constraints on available transformations, preserved semantics, robustness to preprocessing, and plausibility.
Domain Knowledge Alleviates Adversarial Attacks in Multi-Label Classifiers
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022
This paper shows how to implement an adaptive attack that exploits knowledge of the constraints and provides experimental comparisons with popular state-of-the-art attacks, arguing that this approach may be a significant step towards designing more robust multi-label classifiers.
Black-box Adversarial Attacks with Limited Queries and Information
- ICML, 2018
This work defines three realistic threat models that more accurately characterize many real-world classifiers: the query-limited setting, the partial-information setting, and the label-only setting and develops new attacks that fool classifiers under these more restrictive threat models.
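The query-limited setting described in this entry is commonly handled by estimating gradients from model queries alone; the sketch below uses antithetic Gaussian sampling (an NES-style estimator), and `query_loss` is a hypothetical function that queries the target model and returns a scalar loss for a candidate input.

```python
import numpy as np

def nes_gradient(query_loss, x, sigma=1e-3, n_samples=50):
    """Estimate the loss gradient from queries alone (no model gradients),
    using antithetic Gaussian samples around the current input x."""
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        u = np.random.randn(*x.shape)
        grad += (query_loss(x + sigma * u) - query_loss(x - sigma * u)) * u
    return grad / (2.0 * sigma * n_samples)
```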
Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack
- ICML, 2020
A new white-box adversarial attack for neural network-based classifiers that aims to find the minimal perturbation necessary to change the class of a given input; it performs on par with or better than state-of-the-art attacks that are partially specialized to a single $l_p$-norm, and is robust to gradient masking (a simplified minimal-perturbation sketch follows below).
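The sketch below is not the FAB algorithm; it only illustrates the minimal-perturbation objective by bisecting on the budget of any fixed-budget attack. `attack_at_eps` and `is_misclassified` are hypothetical callables standing in for an attack and a success check.

```python
def minimal_eps(attack_at_eps, x, y, is_misclassified, lo=0.0, hi=1.0, iters=12):
    """Bisect on the perturbation budget to approximate the smallest eps at
    which the given attack flips the label of input x."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        x_adv = attack_at_eps(x, y, eps=mid)
        if is_misclassified(x_adv, y):
            hi = mid      # attack succeeded: try a smaller budget
        else:
            lo = mid      # attack failed: need a larger budget
    return hi
```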