Adversarial Transformation Networks: Learning to Generate Adversarial Examples
@article{Baluja2017AdversarialTN,
  title   = {Adversarial Transformation Networks: Learning to Generate Adversarial Examples},
  author  = {Shumeet Baluja and Ian S. Fischer},
  journal = {ArXiv},
  year    = {2017},
  volume  = {abs/1703.09387}
}
Multiple different approaches for generating adversarial examples have been proposed to attack deep neural networks. […] We call such a network an Adversarial Transformation Network (ATN). ATNs are trained to generate adversarial examples that minimally modify the classifier's outputs given the original input, while constraining the new classification to match an adversarial target class. We present methods to train ATNs and analyze their effectiveness targeting a variety of MNIST classifiers as…
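The training objective is compact enough to sketch. Given a frozen target classifier f and an adversarial target class t, the ATN g is trained on a joint loss: a reconstruction term keeping g(x) close to x, plus a term pulling f(g(x)) toward a reranked copy of f(x) in which the target class dominates. Below is a minimal sketch of that objective, assuming a PyTorch setup; `atn`, `classifier`, and the weight values are hypothetical placeholders, with `rerank` standing in for the paper's reranking function r_alpha.

```python
import torch
import torch.nn.functional as F

def rerank(y, target_class, alpha=1.5):
    """Stand-in for the paper's reranking function r_alpha: boost the
    target class to alpha * max(y), then renormalize to a distribution."""
    y = y.clone()
    y[:, target_class] = alpha * y.max(dim=1).values
    return y / y.sum(dim=1, keepdim=True)

def atn_loss(atn, classifier, x, target_class, beta=0.1):
    """Joint ATN objective: beta * L_X(g(x), x) + L_Y(f(g(x)), r(f(x), t)).
    `atn` (g) is trainable; `classifier` (f) stays frozen."""
    x_adv = atn(x)
    y_adv = F.softmax(classifier(x_adv), dim=1)
    with torch.no_grad():
        y_clean = F.softmax(classifier(x), dim=1)
    loss_x = F.mse_loss(x_adv, x)                              # stay close to the input
    loss_y = F.mse_loss(y_adv, rerank(y_clean, target_class))  # match the reranked target
    return beta * loss_x + loss_y
```

Training against this loss once amortizes the attack: generating an adversarial example afterwards is a single forward pass through g, which is the paper's main efficiency claim.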
236 Citations
Generating Adversarial Examples with Adversarial Networks
- Computer Science, IJCAI
- 2018
Adversarial examples generated by AdvGAN on different target models achieve high attack success rates under state-of-the-art defenses compared to other attacks, and placed first with 92.76% accuracy on a public MNIST black-box attack challenge.
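For context on the recipe this summary describes: AdvGAN trains a generator G to emit a perturbation G(x), a discriminator D to distinguish x from x + G(x), and attacks a target classifier f through a weighted sum of a GAN loss, an attack loss, and a soft hinge bound on the perturbation norm. The sketch below is an illustrative reconstruction of the generator's objective, not the authors' code; `G`, `D`, `f`, and the weights alpha, beta, c are assumptions.

```python
import torch
import torch.nn.functional as F

def advgan_generator_loss(G, D, f, x, target, alpha=1.0, beta=1.0, c=0.3):
    """Illustrative generator objective: attack loss + alpha * GAN loss
    + beta * hinge bound on the perturbation norm. D is assumed to
    output a probability in (0, 1); f returns logits."""
    perturbation = G(x)
    x_adv = torch.clamp(x + perturbation, 0, 1)
    d_out = D(x_adv)
    loss_gan = F.binary_cross_entropy(d_out, torch.ones_like(d_out))  # look "real" to D
    loss_adv = F.cross_entropy(f(x_adv), target)                      # hit the target class
    norms = perturbation.flatten(1).norm(dim=1)
    loss_hinge = torch.clamp(norms - c, min=0).mean()                 # soft L2 budget
    return loss_adv + alpha * loss_gan + beta * loss_hinge
```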
A Direct Approach to Robust Deep Learning Using Adversarial Networks
- Computer Science, ICLR
- 2019
This paper models the adversarial noise using a generative network, trained jointly with a discriminative classification network as a minimax game, and shows empirically that this adversarial-network approach works well against black-box attacks, with performance on par with state-of-the-art methods such as ensemble adversarial training and adversarial training with projected gradient descent.
AT-GAN: An Adversarial Generative Model
- Computer Science
- 2020
AT-GAN (Adversarial Transfer on Generative Adversarial Net) is proposed to train an adversarial generative model that can directly and efficiently produce diverse adversarial examples that are realistic to human perception, and yields higher attack success rates against adversarially trained models.
Generating Adversarial Examples with Graph Neural Networks
- Computer Science, UAI
- 2021
It is shown that this method beats state-of-the-art adversarial attacks, including the PGD attack, MI-FGSM, and the Carlini and Wagner attack, reducing the time required to generate adversarial examples with small perturbation norms by over 65%, and achieves good generalization performance on unseen networks.
Generalizable Adversarial Attacks Using Generative Models
- Computer Science, ArXiv
- 2019
This work frames the problem as learning a distribution of adversarial perturbations, enabling it to generate diverse adversarial perturbations given an unperturbed input, and shows that it is domain-agnostic: the same framework can be employed to attack different input domains with minimal modification.
Adversarial Defense via Learning to Generate Diverse Attacks
- Computer Science, 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This work proposes a recursive and stochastic generator that produces much stronger and more diverse perturbations, comprehensively revealing the vulnerability of the target classifier.
Constructing Unrestricted Adversarial Examples with Generative Models
- Computer Science, NeurIPS
- 2018
The empirical results on the MNIST, SVHN, and CelebA datasets show that unrestricted adversarial examples can bypass strong adversarial training and certified defense methods designed for traditional adversarial attacks.
NAG: Network for Adversary Generation
- Computer Science, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
Perturbations crafted by the proposed generative approach, which models the distribution of adversarial perturbations, achieve state-of-the-art fooling rates, exhibit wide variety, and deliver excellent cross-model generalizability.
Detecting Adversarial Examples Through Image Transformation
- Computer Science, AAAI
- 2018
An effective method is proposed to detect adversarial examples in image classification by introducing randomness into the image transformation process, achieving a detection ratio of around 70%.
Machine Learning as an Adversarial Service: Learning Black-Box Adversarial Examples
- Computer Science, ArXiv
- 2017
A direct attack against black-box neural networks is introduced that uses an attacker neural network to learn to craft adversarial examples that transfer to different machine learning models such as Random Forests, SVMs, and K-Nearest Neighbors.
References
Showing 1–10 of 35 references
The Limitations of Deep Learning in Adversarial Settings
- Computer Science, 2016 IEEE European Symposium on Security and Privacy (EuroS&P)
- 2016
This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
Adversarial Machine Learning at Scale
- Computer Science, ICLR
- 2017
This research applies adversarial training to ImageNet, finds that single-step attacks are the best for mounting black-box attacks, and resolves a "label leaking" effect that causes adversarially trained models to perform better on adversarial examples than on clean examples.
Explaining and Harnessing Adversarial Examples
- Computer Science, ICLR
- 2015
It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature; this view is supported by new quantitative results and yields the first explanation of the most intriguing fact about adversarial examples: their generalization across architectures and training sets.
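The fast gradient sign method (FGSM) introduced in this paper is one line of math: x_adv = x + ε · sign(∇_x J(θ, x, y)). A minimal PyTorch sketch, with `model` and `eps` as illustrative placeholders:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.1):
    """One-step attack: move eps along the sign of the input gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```

Its single-step nature is what makes FGSM cheap enough for the large-scale adversarial training studied in the "Adversarial Machine Learning at Scale" entry above.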
On Detecting Adversarial Perturbations
- Computer Science, ICLR
- 2017
It is shown empirically that adversarial perturbations can be detected surprisingly well even though they are quasi-imperceptible to humans.
Towards Evaluating the Robustness of Neural Networks
- Computer Science, 2017 IEEE Symposium on Security and Privacy (SP)
- 2017
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that succeed on both distilled and undistilled neural networks with 100% probability.
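The paper's three attacks target the L0, L2, and L∞ norms; the L2 variant minimizes ||δ||₂² + c · f(x + δ), where f(x') = max(max_{i≠t} Z(x')_i − Z(x')_t, −κ) rewards the target logit dominating all others by margin κ. The sketch below covers just that objective, not the full attack with its binary search over c; `model` is assumed to return logits.

```python
import torch

def cw_l2_loss(model, x, x_adv, target, c=1.0, kappa=0.0):
    """Carlini-Wagner L2 objective, averaged over the batch:
    ||x_adv - x||_2^2 + c * max(max_{i != t} Z_i - Z_t, -kappa)."""
    logits = model(x_adv)
    target_logit = logits.gather(1, target.unsqueeze(1)).squeeze(1)
    others = logits.scatter(1, target.unsqueeze(1), float('-inf'))
    margin = torch.clamp(others.max(dim=1).values - target_logit, min=-kappa)
    dist = ((x_adv - x) ** 2).flatten(1).sum(dim=1)
    return (dist + c * margin).mean()
```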
Delving into Transferable Adversarial Examples and Black-box Attacks
- Computer Science, ICLR
- 2017
This work is the first to conduct an extensive study of transferability over large models and a large-scale dataset, and it is also the first to study the transferability of targeted adversarial examples with their target labels.
Practical Black-Box Attacks against Deep Learning Systems using Adversarial Examples
- Computer Science, ArXiv
- 2016
This work gives the first practical demonstration that the cross-model transfer phenomenon enables attackers to control a remotely hosted DNN with no access to the model, its parameters, or its training data, and introduces the attack strategy of fitting a substitute model to the oracle's input-output pairs and then crafting adversarial examples against this auxiliary model.
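The key mechanism is Jacobian-based dataset augmentation: after fitting the substitute to oracle-labeled queries, new training points are synthesized by stepping each input along the sign of the substitute's gradient for the oracle's label. A minimal sketch under those assumptions (`substitute`, `oracle_labels`, and `lam` are placeholders):

```python
import torch

def jacobian_augment(substitute, oracle_labels, x, lam=0.1):
    """Grow the substitute's training set: step each point along the sign
    of d substitute(x)[oracle label] / dx, then keep old and new points."""
    x = x.clone().detach().requires_grad_(True)
    selected = substitute(x).gather(1, oracle_labels.unsqueeze(1)).sum()
    grad, = torch.autograd.grad(selected, x)
    x_new = (x + lam * grad.sign()).clamp(0, 1).detach()
    return torch.cat([x.detach(), x_new])
```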
Adversarial examples in the physical world
- Computer Science, ICLR
- 2017
It is found that a large fraction of adversarial examples are classified incorrectly even when perceived through a camera, which shows that machine learning systems are vulnerable to adversarial examples even in physical-world scenarios.
Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples
- Computer Science, ArXiv
- 2016
New transferability attacks between previously unexplored (substitute, victim) pairs of machine learning model classes, most notably SVMs and decision trees, are introduced.
Generative Adversarial Nets
- Computer Science, NIPS
- 2014
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a…