Semantic Adversarial Examples

  • Hossein Hosseini, Radha Poovendran
  • Published 16 March 2018
  • Computer Science
  • 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Deep neural networks are known to be vulnerable to adversarial examples, i.e., images that are maliciously perturbed to fool the model. Generating adversarial examples has been mostly limited to finding small perturbations that maximize the model prediction error. Such images, however, contain artificial perturbations that make them somewhat distinguishable from natural images. This property is used by several defense methods to counter adversarial examples by applying denoising filters or… 
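The paper's core idea is to perturb an image's semantics-preserving attributes, such as color, rather than adding small pixel-level noise. A minimal sketch of that hue-shift idea follows; `shift_hue`, `semantic_attack`, and `is_misclassified` are hypothetical names standing in for the paper's actual procedure and a real classifier:

```python
import colorsys
import random

def shift_hue(pixels, delta):
    """Shift the hue of every RGB pixel (floats in [0, 1]) by `delta`
    (wrapping around 1.0), leaving saturation and value untouched, so
    shapes and textures -- the semantics -- are preserved."""
    out = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        out.append(colorsys.hsv_to_rgb((h + delta) % 1.0, s, v))
    return out

def semantic_attack(pixels, is_misclassified, trials=100, seed=0):
    """Randomly sample hue shifts until the (caller-supplied) classifier
    errs; return the first color-shifted image that fools it, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        candidate = shift_hue(pixels, rng.random())
        if is_misclassified(candidate):
            return candidate
    return None
```

For example, shifting a pure-red pixel's hue by 1/3 yields pure green: the object is the same, only its color has changed, yet such shifted images can fool a classifier trained on naturally colored data.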

Figures from this paper

Unrestricted Adversarial Examples via Semantic Manipulation
This paper introduces "unrestricted" perturbations that manipulate semantically meaningful image-based visual descriptors -- color and texture -- in order to generate effective and photorealistic adversarial examples.
Semantic Adversarial Perturbations using Learnt Representations
This work introduces a novel method for the construction of a rich new class of semantic adversarial examples that perturbs the pose, location, size, shape, colour and texture of the objects in an image without manual encoding of these concepts.
Generating Unrestricted Adversarial Examples via Three Parameters
The proposed adversarial attack generates an unrestricted adversarial example with a limited number of parameters, which preserves the image appearance and improves model robustness against a randomly transformed image.
Generating Semantic Adversarial Examples with Differentiable Rendering
Machine learning (ML) algorithms, especially deep neural networks, have demonstrated success in several domains. However, several types of attacks have raised concerns about deploying ML in…
Semantically Adversarial Learnable Filters
A semantic adversarial loss is introduced that guides the training of a fully convolutional neural network to generate adversarial images that will be classified with a label that is semantically different from the label of the original (clean) image.
Structure-Preserving Transformation: Generating Diverse and Transferable Adversarial Examples
Empirical results on the MNIST and Fashion-MNIST datasets show that adversarial examples generated by the SPT approach can easily bypass strong adversarial training and transfer well to other target models with little or no loss in attack success rate.
ColorFool: Semantic Adversarial Colorization
This paper proposes a content-based black-box adversarial attack that generates unrestricted perturbations by exploiting image semantics, selectively modifying colors within ranges that humans perceive as natural; it outperforms prior attacks in success rate, robustness to defense frameworks, and transferability.
Generative Adversarial Examples
The empirical results on the MNIST, SVHN, and CelebA datasets show that generative adversarial examples can easily bypass strong adversarial training and certified defense methods which can foil existing adversarial attacks.
Constructing Unrestricted Adversarial Examples with Generative Models
The empirical results on the MNIST, SVHN, and CelebA datasets show that unrestricted adversarial examples can bypass strong adversarial training and certified defense methods designed for traditional adversarial attacks.
Structure Matters: Towards Generating Transferable Adversarial Images
The novel concepts of structure patterns and structure-aware perturbations that relax the small perturbation constraint while still keeping images natural are proposed for generating natural adversarial examples with extremely high transferability.


Generating Natural Adversarial Examples
This paper proposes a framework to generate natural and legible adversarial examples that lie on the data manifold, by searching in semantic space of dense and continuous data representation, utilizing the recent advances in generative adversarial networks.
PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples
Adversarial perturbations of normal images are usually imperceptible to humans, but they can seriously confuse state-of-the-art machine learning models. What makes them so special in the eyes of…
Ensemble Adversarial Training: Attacks and Defenses
This work finds that adversarial training remains vulnerable to black-box attacks, in which perturbations computed on undefended models transfer to the defended model, and introduces a powerful novel single-step attack that escapes the non-smooth vicinity of the input data via a small random step.
Towards Deep Learning Models Resistant to Adversarial Attacks
This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
Explaining and Harnessing Adversarial Examples
It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.
A Rotation and a Translation Suffice: Fooling CNNs with Simple Transformations
It is shown that neural networks are already vulnerable to significantly simpler - and more likely to occur naturally - transformations of the inputs, and that the current neural network-based vision models might not be as reliable as the authors tend to assume.
Adversarial Machine Learning at Scale
This research applies adversarial training to ImageNet, finds that single-step attacks are best for mounting black-box attacks, and resolves a "label leaking" effect that causes adversarially trained models to perform better on adversarial examples than on clean examples.
Towards Evaluating the Robustness of Neural Networks
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that succeed on both distilled and undistilled neural networks with 100% probability.
Adversarial Patch
A method to create universal, robust, targeted adversarial image patches in the real world, which can be printed, added to any scene, photographed, and presented to image classifiers; even when the patches are small, they cause the classifiers to ignore the other items in the scene and report a chosen target class.
On the Limitation of Convolutional Neural Networks in Recognizing Negative Images
Whether CNNs are capable of learning the semantics of training data is examined, and it is conjectured that current training methods do not effectively train models to generalize the concepts.
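The negative-image test in the last entry is easy to reproduce: a photographic negative simply inverts pixel intensities, leaving edges and shapes (the semantics) intact. A minimal sketch, assuming an 8-bit grayscale image stored as a flat list of ints:

```python
def negate(image):
    """Return the photographic negative of an 8-bit grayscale image:
    each intensity p becomes 255 - p, so bright regions turn dark and
    vice versa while all edges and shapes are preserved."""
    return [255 - p for p in image]

# A human still reads the digit in a negated MNIST image, but a CNN
# trained only on regular images typically misclassifies it.
```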