Robustness May Be at Odds with Accuracy
- Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, A. Madry
- International Conference on Learning Representations
- 30 May 2018
It is shown that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization, and it is argued that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers.
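The tradeoff in question concerns adversarially trained models. For context, below is a minimal sketch of PGD adversarial training, the standard procedure that produces such robust classifiers; the model, optimizer, and hyperparameter values are illustrative assumptions, not the paper's exact setup:

```python
import torch
import torch.nn.functional as F

def adv_train_step(model, opt, x, y, eps=8/255, alpha=2/255, steps=7):
    # One step of PGD adversarial training; the robustness/accuracy tradeoff
    # studied in the paper is exhibited by models trained this way.
    delta = torch.empty_like(x).uniform_(-eps, eps)  # random start in eps-ball
    for _ in range(steps):                           # inner maximization (PGD)
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(torch.clamp(x + delta, 0, 1)), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = torch.clamp(delta + alpha * grad.sign(), -eps, eps).detach()
    opt.zero_grad()                                  # outer minimization
    F.cross_entropy(model(torch.clamp(x + delta, 0, 1)), y).backward()
    opt.step()
```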
Synthesizing Robust Adversarial Examples
- Anish Athalye, Logan Engstrom, Andrew Ilyas, K. Kwok
- International Conference on Machine Learning
- 24 July 2017
The existence of robust 3D adversarial objects is demonstrated, along with the first algorithm for synthesizing examples that are adversarial over a chosen distribution of transformations; the algorithm is used to synthesize two-dimensional adversarial images that are robust to noise, distortion, and affine transformation.
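The algorithm referenced here is Expectation Over Transformation (EOT): optimize the perturbation against the expected loss under a chosen distribution of transformations, estimated by sampling. A minimal PyTorch sketch, where `model` and `transform_sampler` are placeholders and the step sizes are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def eot_attack(model, x, target, transform_sampler, steps=200, lr=0.01, eps=8/255):
    # Expectation Over Transformation: make x + delta classified as `target`
    # in expectation over random transformations t ~ transform_sampler.
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        loss = 0.0
        for _ in range(10):                  # Monte Carlo estimate of E_t[loss]
            t = transform_sampler()          # e.g. random rotation / lighting
            loss = loss + F.cross_entropy(model(t(torch.clamp(x + delta, 0, 1))), target)
        opt.zero_grad()
        (loss / 10).backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)          # keep the perturbation small
    return torch.clamp(x + delta, 0, 1).detach()
```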
Black-box Adversarial Attacks with Limited Queries and Information
- Andrew Ilyas, Logan Engstrom, Anish Athalye, Jessy Lin
- International Conference on Machine Learning
- 23 April 2018
This work defines three realistic threat models that more accurately characterize many real-world classifiers (the query-limited, partial-information, and label-only settings) and develops new attacks that fool classifiers under these more restrictive threat models.
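In the query-limited setting, the paper's attack estimates gradients with natural evolution strategies (NES), using only loss values returned by the black-box model. A minimal sketch; `loss_fn` is a hypothetical query interface returning a scalar loss for an input:

```python
import torch

def nes_gradient(loss_fn, x, sigma=1e-3, n_samples=50):
    # NES gradient estimate: query the model at symmetric random perturbations
    # of x and average; no access to model internals or backprop is needed.
    g = torch.zeros_like(x)
    for _ in range(n_samples):
        u = torch.randn_like(x)
        g += u * loss_fn(x + sigma * u)
        g -= u * loss_fn(x - sigma * u)   # antithetic pair reduces variance
    return g / (2 * n_samples * sigma)
```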
Adversarial Examples Are Not Bugs, They Are Features
- Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Logan Engstrom, Brandon Tran, A. Madry
- Neural Information Processing Systems
- 6 May 2019
It is demonstrated that adversarial examples can be directly attributed to the presence of non-robust features: features derived from patterns in the data distribution that are highly predictive, yet brittle and incomprehensible to humans.
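A key experiment perturbs each training image toward a target class, relabels it as that class, and shows that models trained on the relabeled set generalize to the real test set. A minimal sketch of one construction step (the step count and size are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def make_nonrobust_example(model, x, target, steps=20, alpha=0.1):
    # Perturb x so `model` assigns it the (wrong) class `target`, then relabel.
    # If (x_adv, target) pairs suffice to train an accurate classifier, the
    # perturbations must carry predictive, non-robust features.
    x_adv = x.clone().requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x_adv), target)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv -= alpha * grad.sign()   # descend toward the target class
            x_adv.clamp_(0, 1)
    return x_adv.detach(), target          # relabeled training pair
```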
Prior Convictions: Black-Box Adversarial Attacks with Bandits and Priors
- Andrew Ilyas, Logan Engstrom, A. Madry
- International Conference on Learning Representations
- 20 July 2018
A framework that conceptually unifies much of the existing work on black-box attacks is introduced, and it is demonstrated that the current state-of-the-art methods are optimal in a natural sense.
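Within that framework, gradient estimation becomes a bandit problem in which a running prior over the gradient is refined with two queries per step. A loose sketch of one such update (parameter names and step sizes are assumptions, not the paper's exact algorithm):

```python
import torch

def bandit_prior_step(loss_fn, x, prior, fd_eta=0.1, lr=0.01):
    # Perturb the running gradient prior in a random direction u and nudge it
    # toward whichever of (prior + u), (prior - u) yields the larger loss.
    u = torch.randn_like(prior)
    d = (loss_fn(x + fd_eta * (prior + u))
         - loss_fn(x + fd_eta * (prior - u))) / fd_eta   # two-query estimate
    return prior + lr * d * u   # time-dependent prior carried across steps
```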
Noise or Signal: The Role of Image Backgrounds in Object Recognition
- Kai Y. Xiao, Logan Engstrom, Andrew Ilyas, A. Madry
- International Conference on Learning Representations
- 17 June 2020
This work creates a toolkit for disentangling foreground and background signal on ImageNet images, and finds that models can achieve non-trivial accuracy by relying on the background alone and that more accurate models tend to depend on backgrounds less.
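One measurement the toolkit enables is a model's accuracy on background-only images. A minimal evaluation sketch; `only_bg_loader` is a hypothetical loader over images with foregrounds removed:

```python
import torch

@torch.no_grad()
def background_only_accuracy(model, only_bg_loader):
    # Non-trivial accuracy on foreground-removed images indicates the model
    # extracts label signal from backgrounds alone.
    model.eval()
    correct = total = 0
    for x, y in only_bg_loader:
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total
```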
Exploring the Landscape of Spatial Robustness
- Logan Engstrom, Brandon Tran, Dimitris Tsipras, Ludwig Schmidt, A. Madry
- International Conference on Machine Learning
- 7 December 2017
This work thoroughly investigates the vulnerability of neural network-based classifiers to rotations and translations and finds that, in contrast to the p-norm case, first-order methods cannot reliably find worst-case perturbations.
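Because first-order methods are unreliable here, the worst-case spatial perturbation can instead be found by brute force over a small grid of rotations and translations. A minimal sketch (the grid ranges are illustrative assumptions):

```python
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF

def worst_case_spatial(model, x, y, angles=range(-30, 31, 5), shifts=range(-3, 4)):
    # Exhaustive grid search over rotations (degrees) and pixel translations;
    # returns the transformed input on which the model's loss is highest.
    worst_loss, worst_x = float("-inf"), x
    for angle in angles:
        for dx in shifts:
            for dy in shifts:
                xt = TF.affine(x, angle=float(angle), translate=[dx, dy],
                               scale=1.0, shear=0.0)
                loss = F.cross_entropy(model(xt), y).item()
                if loss > worst_loss:
                    worst_loss, worst_x = loss, xt
    return worst_x
```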
Do Adversarially Robust ImageNet Models Transfer Better?
- Hadi Salman, Andrew Ilyas, Logan Engstrom, Ashish Kapoor, A. Madry
- Neural Information Processing Systems
- 16 July 2020
Focusing on adversarially robust ImageNet classifiers, it is found that robust models, while less accurate, often outperform their standard-trained counterparts when used for transfer learning.
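The practical recipe is to swap a standard ImageNet backbone for a robustly trained one before fine-tuning. A minimal fixed-feature transfer sketch; the checkpoint path and state-dict format are assumptions (the released models may package weights differently):

```python
import torch
from torchvision import models

def fixed_feature_transfer(num_classes, ckpt="robust_resnet50.pt"):
    # Load an adversarially trained ResNet-50, replace the head, and freeze
    # everything except the new classification layer.
    backbone = models.resnet50()
    backbone.load_state_dict(torch.load(ckpt))    # assumed checkpoint format
    backbone.fc = torch.nn.Linear(backbone.fc.in_features, num_classes)
    for name, p in backbone.named_parameters():
        p.requires_grad = name.startswith("fc.")  # train only the new head
    return backbone
```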
A Rotation and a Translation Suffice: Fooling CNNs with Simple Transformations
- Logan Engstrom, Dimitris Tsipras, Ludwig Schmidt, A. Madry
- arXiv
- 7 December 2017
It is shown that neural networks are already vulnerable to significantly simpler transformations of the inputs that are more likely to occur naturally, and that current neural network-based vision models might not be as reliable as commonly assumed.
Image Synthesis with a Single (Robust) Classifier
- Shibani Santurkar, Andrew Ilyas, Dimitris Tsipras, Logan Engstrom, Brandon Tran, A. Madry
- Neural Information Processing Systems
- 6 June 2019
It is shown that adversarial robustness is precisely what is needed to directly manipulate salient features of the input, demonstrating the utility of robust classifiers in the broader machine learning context.
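The manipulation itself is plain gradient ascent on a class score; robustness is what makes those gradients perceptually meaningful. A minimal synthesis sketch starting from noise (step size and count are illustrative assumptions):

```python
import torch

def synthesize_class(model, target, steps=60, lr=0.5, shape=(1, 3, 224, 224)):
    # Ascend the target-class logit of a robust classifier; with a standard
    # (non-robust) model the same procedure yields noise-like images.
    z = torch.randn(shape, requires_grad=True)
    for _ in range(steps):
        score = model(torch.sigmoid(z))[0, target]  # sigmoid keeps pixels in [0,1]
        grad = torch.autograd.grad(score, z)[0]
        with torch.no_grad():
            z += lr * grad / (grad.norm() + 1e-12)  # normalized ascent step
    return torch.sigmoid(z).detach()
```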
...