Corpus ID: 67855875

Quantifying Perceptual Distortion of Adversarial Examples

@article{Jordan2019QuantifyingPD,
  title={Quantifying Perceptual Distortion of Adversarial Examples},
  author={Matt Jordan and Naren Manoj and Surbhi Goel and Alexandros G. Dimakis},
  journal={ArXiv},
  year={2019},
  volume={abs/1902.08265}
}
Recent work has shown that additive threat models, which only permit the addition of bounded noise to the pixels of an image, are insufficient for fully capturing the space of imperceivable adversarial examples. For example, small rotations and spatial transformations can fool classifiers, remain imperceivable to humans, but have large additive distance from the original images. In this work, we leverage quantitative perceptual metrics like LPIPS and SSIM to define a novel threat model for… 
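As context for the threat model described above, the sketch below (illustrative only, not the authors' code) contrasts the additive ($\ell_\infty$) distance with the perceptual metrics LPIPS and SSIM on a randomly perturbed image. It assumes the third-party lpips and scikit-image packages (the channel_axis argument needs scikit-image >= 0.19) along with PyTorch and NumPy.

# Illustrative only: compare additive (L_inf) distance with perceptual
# distances (SSIM, LPIPS) between an image and a perturbed copy of it.
import numpy as np
import torch
import lpips                                          # pip install lpips
from skimage.metrics import structural_similarity as ssim

# Hypothetical inputs: HxWx3 float images with values in [0, 1].
x = np.random.rand(224, 224, 3).astype(np.float32)
noise = np.random.uniform(-8 / 255, 8 / 255, x.shape).astype(np.float32)
x_adv = np.clip(x + noise, 0.0, 1.0)

# Additive threat model: maximum per-pixel change.
linf = float(np.abs(x_adv - x).max())

# SSIM: structural similarity; 1.0 means identical images.
ssim_val = ssim(x, x_adv, channel_axis=2, data_range=1.0)

# LPIPS: learned perceptual distance; expects NCHW tensors scaled to [-1, 1].
loss_fn = lpips.LPIPS(net="alex")
to_tensor = lambda a: torch.from_numpy(a).permute(2, 0, 1).unsqueeze(0) * 2 - 1
with torch.no_grad():
    lpips_val = loss_fn(to_tensor(x), to_tensor(x_adv)).item()

print(f"L_inf={linf:.4f}  SSIM={ssim_val:.4f}  LPIPS={lpips_val:.4f}")

A small additive perturbation like the one above typically keeps SSIM near 1 and LPIPS near 0, whereas the spatial transformations discussed in the abstract can look equally benign under these metrics despite a large additive distance.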
Improving Perceptual Quality of Adversarial Images Using Perceptual Distance Minimization and Normalized Variance Weighting
TLDR
Two attack-agnostic methods are proposed to increase the perceptual quality of adversarial images, measured by the LPIPS perceptual distance metric, while preserving the target fooling rate.
Perceptually Guided Adversarial Perturbations
TLDR
This work proposes a novel framework for generating adversarial perturbations by explicitly incorporating a “perceptual quality ball” constraint in its formulation, and poses adversarial example generation as a tractable convex optimization problem, with constraints taken from a mathematically amenable variant of the popular SSIM index.
Imperceptible Adversarial Examples by Spatial Chroma-Shift
TLDR
A spatial-transformation-based perturbation method is proposed that creates adversarial examples by modifying only the color components of an input image; human visual perception studies validate that the examples look more natural and are often indistinguishable from their original counterparts.
Sparse Adversarial Video Attacks with Spatial Transformations
TLDR
An adversarial attack strategy on videos, called DeepSAVA, is proposed that combines additive perturbation and spatial transformation in a unified optimisation framework, with the structural similarity index measure adopted as the adversarial distance.
Can Perceptual Guidance Lead to Semantically Explainable Adversarial Perturbations?
TLDR
This work proposes a novel framework for generating adversarial perturbations by explicitly incorporating a “perceptual quality ball” constraint in the authors' formulation, and poses the adversarial example generation problem as a tractable convex optimization problem, with constraints taken from a mathematically amenable variant of the popular SSIM index.
Semantics Preserving Adversarial Examples
TLDR
This paper proposes a framework to create semantics-preserving adversarial examples by developing a manifold-invariant adversarial perturbation technique that induces the perturbed elements to remain in the manifold while satisfying adversarial constraints.
Functional Adversarial Attacks
TLDR
It is shown that functional threat models can be combined with existing additive ($\ell_p$) threat models to generate stronger threat models that allow both small, individual perturbations and large, uniform changes to an input.
Toward Visual Distortion in Black-Box Attacks
TLDR
This paper proposes a novel black-box attack approach that can directly minimize the induced distortion by learning the noise distribution of the adversarial example, assuming only loss-oracle access to the black-box network.
Adversarial Robustness Against the Union of Multiple Perturbation Models
TLDR
This work shows that it is indeed possible to adversarially train a robust model against a union of norm-bounded attacks, by using a natural generalization of the standard PGD-based procedure for adversarial training to multiple threat models.
Localized Uncertainty Attacks
The susceptibility of deep learning models to adversarial perturbations has stirred renewed attention in adversarial examples, resulting in a number of attacks. However, most of these attacks fail to
...
...

References

SHOWING 1-10 OF 25 REFERENCES
On the Suitability of Lp-Norms for Creating and Preventing Adversarial Examples
TLDR
It is demonstrated that nearness of inputs as measured by Lp-norms is neither necessary nor sufficient for perceptual similarity, which has implications for both creating and defending against adversarial examples.
Synthesizing Robust Adversarial Examples
TLDR
The existence of robust 3D adversarial objects is demonstrated, and the first algorithm for synthesizing examples that are adversarial over a chosen distribution of transformations is presented; the algorithm also synthesizes two-dimensional adversarial images that are robust to noise, distortion, and affine transformation.
Spatially Transformed Adversarial Examples
TLDR
Perturbations generated through spatial transformation can result in large $\mathcal{L}_p$ distance measures, but extensive experiments show that such spatially transformed adversarial examples are perceptually realistic and more difficult to defend against with existing defense systems.
A Rotation and a Translation Suffice: Fooling CNNs with Simple Transformations
TLDR
It is shown that neural networks are already vulnerable to significantly simpler transformations of the inputs (ones more likely to occur naturally), and that current neural-network-based vision models might not be as reliable as commonly assumed.
Towards Deep Learning Models Resistant to Adversarial Attacks
TLDR
This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee (a minimal PGD sketch in this spirit appears after the reference list).
One Pixel Attack for Fooling Deep Neural Networks
TLDR
This paper proposes a novel method for generating one-pixel adversarial perturbations based on differential evolution (DE), which requires less adversarial information (a black-box attack) and can fool more types of networks due to the inherent features of DE.
Towards Evaluating the Robustness of Neural Networks
TLDR
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that succeed on both distilled and undistilled neural networks with 100% probability.
Adversarial examples in the physical world
TLDR
It is found that a large fraction of adversarial examples are classified incorrectly even when perceived through a camera, which shows that even in physical-world scenarios, machine learning systems are vulnerable to adversarial examples.
Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods
TLDR
It is concluded that adversarial examples are significantly harder to detect than previously appreciated, and that the properties believed to be intrinsic to adversarial examples are in fact not.
Explaining and Harnessing Adversarial Examples
TLDR
It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature; this view is supported by new quantitative results and gives the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.
...
...
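As a companion to the Madry et al. and Goodfellow et al. references above, the following is a minimal sketch of an $\ell_\infty$ projected gradient descent (PGD) attack, with FGSM recovered as the single-step special case. It is not code from either paper; the classifier model, images x (scaled to [0, 1], NCHW) and labels y are assumed to be supplied, and PyTorch is assumed.

import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Return adversarial examples inside an L_inf ball of radius eps around x."""
    # Random start inside the ball, as in Madry et al.-style PGD.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # Ascend the loss along the gradient sign, then project back to the ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

# FGSM (Goodfellow et al.) is the steps=1, alpha=eps case, usually without the random start.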