One Pixel Attack for Fooling Deep Neural Networks

@article{Su2019OnePA,
  title={One Pixel Attack for Fooling Deep Neural Networks},
  author={Jiawei Su and Danilo Vasconcellos Vargas and Kouichi Sakurai},
  journal={IEEE Transactions on Evolutionary Computation},
  year={2019},
  volume={23},
  pages={828-841}
}
Recent research has revealed that the output of deep neural networks (DNNs) can be easily altered by adding relatively small perturbations to the input vector. […] The proposed attack, based on differential evolution (DE), requires less adversarial information (a black-box attack) and can fool more types of networks due to the inherent features of DE. The results show that 67.97% of the natural images in the Kaggle CIFAR-10 test dataset and 16.04% of the ImageNet (ILSVRC 2012) test images can be perturbed to at least one target class by modifying just one…
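For concreteness, the following is a minimal sketch of a differential-evolution search for a single-pixel perturbation in the spirit of the attack described above. It is not the authors' implementation: the predict function (a black-box returning class probabilities), the population size, the mutation factor F, and the omission of crossover are illustrative assumptions.

import numpy as np

def one_pixel_attack(image, true_label, predict, pop_size=100, iters=75,
                     F=0.5, seed=0):
    """Untargeted one-pixel attack via a simplified differential-evolution search.

    image      : HxWx3 uint8 array
    true_label : index of the correct class
    predict    : black-box function mapping an image batch to class probabilities
    """
    rng = np.random.default_rng(seed)
    h, w, _ = image.shape
    # Each candidate encodes one pixel: (x, y, r, g, b).
    bounds = np.array([w, h, 256, 256, 256], dtype=float)
    pop = rng.random((pop_size, 5)) * bounds

    def apply(candidate):
        x, y, r, g, b = candidate.astype(int)
        perturbed = image.copy()
        perturbed[y % h, x % w] = (r % 256, g % 256, b % 256)
        return perturbed

    def fitness(candidate):
        # Probability the model assigns to the true class; the attacker minimises it.
        return predict(apply(candidate)[None, ...])[0][true_label]

    scores = np.array([fitness(c) for c in pop])
    for _ in range(iters):
        for i in range(pop_size):
            # DE/rand/1 mutation; crossover omitted for simplicity.
            a, b, c = pop[rng.choice(pop_size, size=3, replace=False)]
            trial = np.clip(a + F * (b - c), 0, bounds - 1)
            trial_score = fitness(trial)
            if trial_score < scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, trial_score

    best = apply(pop[np.argmin(scores)])
    success = int(np.argmax(predict(best[None, ...])[0])) != true_label
    return best, success

Because the search only ever queries predict, the sketch needs no gradients or architecture details, which is what makes this a black-box attack.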
One Sparse Perturbation to Fool them All, almost Always!
TLDR
The proposed DEceit algorithm for constructing effective universal pixel-restricted perturbations using only black-box feedback from the target network achieves a commendable and highly transferable Fooling Rate while retaining the visual quality.
XGAN: adversarial attacks with GAN
TLDR
A method that generates target images directly with a GAN to attack deep neural networks, achieving an 82% success rate for black-box targeted attacks on the CIFAR-10 and MNIST datasets.
ABCAttack: A Gradient-Free Optimization Black-Box Attack for Fooling Deep Image Classifiers
TLDR
The experimental results show that the proposed ABCAttack not only obtains a high attack success rate with fewer queries in the black-box setting but also breaks some existing defenses to a large extent; it is not limited by model structure or size, which suggests further research directions for deep learning evasion attacks and defenses.
Evaluating and Improving Adversarial Attacks on DNN-Based Modulation Recognition
TLDR
This study designs the Fitting Difference as a metric for measuring perturbed waveforms and proposes a new method, the Nesterov Adam Iterative Method, for generating adversarial examples; experiments show that the proposed algorithm not only mounts excellent white-box attacks but can also attack black-box models.
Exploring and Expanding the One-Pixel Attack
TLDR
This work improves upon the original implementation of the attack by designing a fitness function that minimizes the magnitude of the perturbation in addition to the network confidence, explores the attack's efficacy by comparing its performance on neural networks of different depths, and analyzes the technique by computing per-pixel heatmaps of vulnerabilities in input images.
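As an illustration of the idea, a fitness function of this kind might jointly penalise the network's confidence in the true class and the size of the perturbation. The weighting alpha, the L2 normalisation, and the predict interface below are assumptions for illustration, not the cited paper's exact formulation.

import numpy as np

def combined_fitness(original, perturbed, true_label, predict, alpha=0.9):
    """Lower is better: weighted sum of true-class confidence and perturbation size.

    predict is assumed to map an image batch to class probabilities; alpha
    trades off misclassification pressure against perturbation magnitude.
    """
    confidence = predict(perturbed[None, ...])[0][true_label]
    # Normalise the L2 perturbation norm by the largest possible 8-bit perturbation.
    delta = perturbed.astype(float) - original.astype(float)
    magnitude = np.linalg.norm(delta) / (255.0 * np.sqrt(original.size))
    return alpha * confidence + (1.0 - alpha) * magnitude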
Towards Imperceptible Adversarial Image Patches Based on Network Explanations
TLDR
A novel method that uses network explanations to find the contributing feature regions (CFRs) of images and generates perturbations restricted to those local regions, using a soft mask matrix to represent the CFRs and finely characterize the contribution of each pixel.
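The soft-mask idea can be sketched briefly: a perturbation is weighted elementwise by a mask in [0, 1] derived from an explanation map, so pixels outside the contributing regions are barely changed. The explanation_map input and the min-max normalisation below are illustrative assumptions rather than the cited method.

import numpy as np

def masked_perturbation(image, delta, explanation_map):
    """Apply a perturbation only where the explanation says the pixels matter.

    image           : HxWx3 uint8 array
    delta           : HxWx3 float perturbation
    explanation_map : HxW saliency/attribution scores (any real values)
    """
    # Rescale the explanation to a soft mask in [0, 1].
    m = explanation_map - explanation_map.min()
    m = m / (m.max() + 1e-12)
    soft_mask = m[..., None]  # broadcast over colour channels
    adversarial = image.astype(float) + soft_mask * delta
    return np.clip(adversarial, 0, 255).astype(np.uint8)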
Adversarial attack to fool object detector
TLDR
This work proposes the Plug-n-Play Adversarial Attack (PPAA), a simple technique that is computationally efficient in terms of the average number of iterations and generates perturbations from constrained uniform random noise.
Testing Convolutional Neural Network using Adversarial Attacks on Potential Critical Pixels
TLDR
An adversarial attack method called Dual Iterative Fusion (DIF), which uses potential critical pixels for CNN testing to reveal the vulnerability of CNNs; it observes that some classes are more vulnerable than others in many classical CNNs for image classification.
Empirical Evaluation on Robustness of Deep Convolutional Neural Networks Activation Functions Against Adversarial Perturbation
TLDR
It is shown that the choice of activation function can be an important factor influencing the robustness of DCNNs, and that, compared with sigmoid, the ReLU non-linearity is more vulnerable, allowing the low-dimensional one-pixel attack to achieve a much higher attack success rate and confidence.

References

Detecting Adversarial Image Examples in Deep Neural Networks with Adaptive Noise Reduction
TLDR
This paper proposes a straightforward method for detecting adversarial image examples, which can be directly deployed into unmodified off-the-shelf DNN models and raises the bar for defense-aware attacks.
Simple Black-Box Adversarial Attacks on Deep Neural Networks
TLDR
This work focuses on deep convolutional neural networks and demonstrates that adversaries can easily craft adversarial examples even without any internal knowledge of the target network, and proposes schemes that could serve as a litmus test for designing robust networks.
Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks
TLDR
Two feature squeezing methods are explored: reducing the color bit depth of each pixel and spatial smoothing, which are inexpensive and complementary to other defenses, and can be combined in a joint detection framework to achieve high detection rates against state-of-the-art attacks.
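Both squeezers are simple image transforms, so a hedged sketch of the detection recipe is easy to give: compare the model's softmax output on the original input with its outputs on the squeezed versions and flag large disagreements. The predict function and the fixed L1 threshold below are placeholders, not the paper's tuned configuration.

import numpy as np
from scipy.ndimage import median_filter

def reduce_bit_depth(image, bits=4):
    """Keep only the top `bits` bits of each 8-bit colour channel."""
    levels = 2 ** bits
    return np.floor(image / 256.0 * levels) / levels * 255.0

def spatial_smooth(image, size=2):
    """Median-smooth each colour channel with a small window."""
    return median_filter(image, size=(size, size, 1))

def looks_adversarial(image, predict, threshold=1.0):
    """Flag inputs whose prediction shifts strongly after squeezing.

    predict maps an image batch to softmax probabilities; the L1 threshold
    would normally be chosen from clean validation data rather than fixed here.
    """
    p_orig = predict(image[None, ...])[0]
    shifts = [
        np.abs(p_orig - predict(reduce_bit_depth(image)[None, ...])[0]).sum(),
        np.abs(p_orig - predict(spatial_smooth(image)[None, ...])[0]).sum(),
    ]
    return max(shifts) > threshold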
Adversarial Perturbations Against Deep Neural Networks for Malware Classification
TLDR
This paper shows how to construct highly-effective adversarial sample crafting attacks for neural networks used as malware classifiers, and evaluates to which extent potential defensive mechanisms against adversarial crafting can be leveraged to the setting of malware classification.
Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks
TLDR
The study shows that defensive distillation can reduce the effectiveness of adversarial sample creation from 95% to less than 0.5% on a studied DNN, and analytically investigates the generalizability and robustness properties granted by the use of defensive distillation when training DNNs.
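For reference, the core of a distillation-style defense is a soft-label training objective at an elevated temperature. The PyTorch-style sketch below assumes generic teacher and student logits and an illustrative temperature value; it is not the exact training recipe studied in the cited paper.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=20.0):
    """Match the student's temperature-softened predictions to the teacher's.

    T is the distillation temperature (illustrative value); the T*T factor keeps
    gradient magnitudes comparable to standard cross-entropy training.
    """
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    log_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)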
The Limitations of Deep Learning in Adversarial Settings
TLDR
This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
Towards Evaluating the Robustness of Neural Networks
TLDR
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms that succeed on both distilled and undistilled neural networks with 100% probability are introduced.
Adversarial Diversity and Hard Positive Generation
TLDR
A new psychometric perceptual adversarial similarity score (PASS) for quantifying adversarial images is introduced, along with the notion of hard positive generation and a novel hot/cold approach to adversarial example generation that provides multiple possible adversarial perturbations for every single image.
Universal Adversarial Perturbations
TLDR
The surprising existence of universal perturbations reveals important geometric correlations among the high-dimensional decision boundary of classifiers and outlines potential security breaches with the existence of single directions in the input space that adversaries can possibly exploit to break a classifier on most natural images.
Adversarial Examples: Attacks and Defenses for Deep Learning
TLDR
The methods for generating adversarial examples for DNNs are summarized, a taxonomy of these methods is proposed, and three major challenges of adversarial examples, together with their potential solutions, are discussed.