• Corpus ID: 4730292

Robust Physical-World Attacks on Deep Learning Models

@article{Evtimov2017RobustPA,
  title={Robust Physical-World Attacks on Deep Learning Models},
  author={I. Evtimov and Kevin Eykholt and Earlence Fernandes and Tadayoshi Kohno and Bo Li and Atul Prakash and Amir Rahmati and Dawn Xiaodong Song},
  journal={arXiv: Cryptography and Security},
  year={2017}
}
Recent studies show that state-of-the-art deep neural networks (DNNs) are vulnerable to adversarial examples, resulting from small-magnitude perturbations added to the input. Given that emerging physical systems are using DNNs in safety-critical situations, adversarial examples could mislead these systems and cause dangerous situations. Therefore, understanding adversarial examples in the physical world is an important step towards developing resilient learning algorithms. We propose a…
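
To make the "small-magnitude perturbations added to the input" concrete, the following is a minimal, hedged sketch of a gradient-sign perturbation in PyTorch. It illustrates digital adversarial examples only, not the physical-world attack this paper proposes; model, image, and label are hypothetical placeholders.

    # Minimal FGSM-style sketch (illustration only, not the authors' method).
    # `model`, `image`, and `label` are hypothetical placeholders.
    import torch
    import torch.nn.functional as F

    def fgsm_perturb(model, image, label, epsilon=0.03):
        # Take the gradient of the loss with respect to the input and step
        # a small amount in the direction of its sign.
        image = image.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(image), label)
        loss.backward()
        adversarial = image + epsilon * image.grad.sign()
        # Keep pixel values in a valid [0, 1] range.
        return adversarial.clamp(0.0, 1.0).detach()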

Citations

Simple Physical Adversarial Examples against End-to-End Autonomous Driving Models
TLDR
These are the first end-to-end attacks on autonomous driving in simulation, using simple physically realizable attacks: painting black lines on the road to target deep neural network models for end-to-end autonomous driving control.
Enhancing ML Robustness Using Physical-World Constraints
TLDR
Results on the KITTI and GTSRB datasets show improved robustness against physical attacks with minimal harm to accuracy, using a hierarchical classification paradigm that enforces invariants limiting the attacker's action space.
Towards Imperceptible and Robust Adversarial Example Attacks against Neural Networks
TLDR
This work presents a new adversarial example crafting method that takes the human perceptual system into consideration and maximizes the noise tolerance of the crafted adversarial example, and demonstrates the efficacy of the proposed technique.
Generating Adversarial Examples with Adversarial Networks
TLDR
Adversarial examples generated by AdvGAN on different target models have a high attack success rate under state-of-the-art defenses compared to other attacks, and placed first with 92.76% accuracy on a public MNIST black-box attack challenge.
Improved Methodology for Evaluating Adversarial Robustness in Deep Neural Networks
  • Lee • Computer Science • 2020
TLDR
This work identifies three common cases that lead to overestimation of accuracy against perturbed examples generated by bounded first-order attack methods, and proposes compensation methods that address sources of inaccurate gradient computation, such as numerical saturation for near-zero values and nondifferentiability.
Non-Determinism in Neural Networks for Adversarial Robustness
TLDR
A novel neural network paradigm is proposed that falls under the category of randomized models for adversarial robustness, but differs from all existing techniques under this category in that it models each parameter of the network as a statistical distribution with learnable parameters.
EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples
TLDR
The authors' elastic-net attacks to DNNs (EAD) feature L1-oriented adversarial examples and include the state-of-the-art L2 attack as a special case, suggesting novel insights on leveraging L1 distortion in adversarial machine learning and the security implications of DNNs.
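
For orientation, the elastic-net attack summarized above is commonly written as minimizing a targeted classification loss plus a mixed L1/L2 distortion penalty; the following is a hedged sketch of that objective, with the notation (c, beta, f, x, x', t) assumed here rather than quoted from the entry:

    \min_{x'} \; c \cdot f(x', t) + \|x' - x\|_2^2 + \beta \, \|x' - x\|_1
    \quad \text{subject to } x' \in [0, 1]^p

Setting \beta = 0 reduces the distortion term to the L2 penalty alone, which is how the L2 attack arises as a special case.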
Adversarial Neon Beam: Robust Physical-World Adversarial Attack to DNNs
TLDR
This work proposes an attack method called adversarial neon beam (AdvNB), which can execute the physical attack by obtaining the physical parameters of the adversarial neon beam with very few queries, and can achieve better concealment of the physical perturbation.
Adversarial Examples: Attacks and Defenses for Deep Learning
TLDR
The methods for generating adversarial examples for DNNs are summarized, a taxonomy of these methods is proposed, and three major challenges in adversarial examples along with potential solutions are discussed.
Generating Adversarial Fragments with Adversarial Networks for Physical-world Implementation
TLDR
FragGAN is the first approach that can implement effective and clean-label physical-world attacks against image classification, as the resulting attacks may succeed even without modifying any essential content of an image.
...

References

Showing 1-10 of 55 references
Adversarial Attacks on Neural Network Policies
TLDR
This work shows existing adversarial example crafting techniques can be used to significantly degrade test-time performance of trained policies, even with small adversarial perturbations that do not interfere with human perception.
The Limitations of Deep Learning in Adversarial Settings
TLDR
This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
Adversarial examples in the physical world
TLDR
It is found that a large fraction of adversarial examples are classified incorrectly even when perceived through the camera, which shows that even in physical-world scenarios, machine learning systems are vulnerable to adversarial examples.
Towards Deep Learning Models Resistant to Adversarial Attacks
TLDR
This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
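
The robust-optimization lens mentioned above is usually captured by a saddle-point problem; a sketch of that formulation, with the loss L, parameters \theta, data distribution \mathcal{D}, and perturbation budget \epsilon assumed as notation:

    \min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}} \left[ \max_{\|\delta\|_\infty \le \epsilon} L(\theta, x + \delta, y) \right]

The inner maximization models a first-order adversary; the outer minimization trains the network against it.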
Note on Attacking Object Detectors with Adversarial Stickers
TLDR
An algorithm is designed that produces physical adversarial inputs which can fool the YOLO object detector, and can also attack Faster R-CNN with a relatively high success rate based on transferability.
Towards Evaluating the Robustness of Neural Networks
TLDR
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that are successful on both distilled and undistilled neural networks with 100% probability.
Delving into Transferable Adversarial Examples and Black-box Attacks
TLDR
This work is the first to conduct an extensive study of transferability over large models and a large-scale dataset, and it is also the first to study the transferability of targeted adversarial examples with their target labels.
Adversarial Examples for Generative Models
TLDR
This work explores methods of producing adversarial examples on deep generative models such as the variational autoencoder (VAE) and the VAE-GAN and presents three classes of attacks, motivating why an attacker might be interested in deploying such techniques against a target generative network.
Universal Adversarial Perturbations
TLDR
The surprising existence of universal perturbations reveals important geometric correlations among the high-dimensional decision boundaries of classifiers, and points to a potential security breach: single directions in the input space that adversaries can exploit to break a classifier on most natural images.
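
As a pointer to what "universal" means above, the perturbation is a single vector v that fools the classifier \hat{k} on most inputs drawn from the data distribution \mu; a sketch of the constraint, with \xi and \delta assumed here as the norm budget and allowed failure rate:

    \|v\|_p \le \xi, \qquad
    \mathbb{P}_{x \sim \mu}\!\left[ \hat{k}(x + v) \ne \hat{k}(x) \right] \ge 1 - \delta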
Synthesizing Robust Adversarial Examples
TLDR
The existence of robust 3D adversarial objects is demonstrated, and the first algorithm for synthesizing examples that are adversarial over a chosen distribution of transformations is presented, which synthesizes two-dimensional adversarial images that are robust to noise, distortion, and affine transformation.
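
The "adversarial over a chosen distribution of transformations" idea above is often written as an expectation-over-transformation objective; a sketch, with the transformation distribution T, target class y_t, and distance measure d assumed as notation:

    \arg\max_{x'} \; \mathbb{E}_{t \sim T}\!\left[ \log P\!\left( y_t \mid t(x') \right) \right]
    \quad \text{subject to } \mathbb{E}_{t \sim T}\!\left[ d\!\left( t(x'), t(x) \right) \right] < \epsilon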
...