• Corpus ID: 38308747


  author={Sandy H. Huang and Nicolas Papernot and Ian J. Goodfellow and Yan Duan and P. Abbeel},
Machine learning classifiers are known to be vulnerable to inputs maliciously constructed by adversaries to force misclassification. Such adversarial examples have been extensively studied in the context of computer vision applications. In this work, we show adversarial attacks are also effective when targeting neural network policies in reinforcement learning. Specifically, we show existing adversarial example crafting techniques can be used to significantly degrade test-time performance of… 

Figures from this paper


Adversarial Machine Learning at Scale
This research applies adversarial training to ImageNet and finds that single-step attacks are the best for mounting black-box attacks, and resolution of a "label leaking" effect that causes adversarially trained models to perform better on adversarial examples than on clean examples.
Practical Black-Box Attacks against Deep Learning Systems using Adversarial Examples
This work introduces the first practical demonstration that cross-model transfer phenomenon enables attackers to control a remotely hosted DNN with no access to the model, its parameters, or its training data, and introduces the attack strategy of fitting a substitute model to the input-output pairs in this manner, then crafting adversarial examples based on this auxiliary model.
Explaining and Harnessing Adversarial Examples
It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.
Vulnerability of Deep Reinforcement Learning to Policy Induction Attacks
This work establishes that reinforcement learning techniques based on Deep Q-Networks are also vulnerable to adversarial input perturbations, and presents a novel class of attacks based on this vulnerability that enable policy manipulation and induction in the learning process of DQNs.
Adversarial examples in the physical world
It is found that a large fraction of adversarial examples are classified incorrectly even when perceived through the camera, which shows that even in physical world scenarios, machine learning systems are vulnerable to adversarialExamples.
Intriguing properties of neural networks
It is found that there is no distinction between individual highlevel units and random linear combinations of high level units, according to various methods of unit analysis, and it is suggested that it is the space, rather than the individual units, that contains of the semantic information in the high layers of neural networks.
Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition
A novel class of attacks is defined: attacks that are physically realizable and inconspicuous, and allow an attacker to evade recognition or impersonate another individual, and a systematic method to automatically generate such attacks is developed through printing a pair of eyeglass frames.
Playing Atari with Deep Reinforcement Learning
This work presents the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning, which outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
Benchmarking Deep Reinforcement Learning for Continuous Control
This work presents a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with very high state and action dimensionality such as 3D humanoid locomotion, task with partial observations, and tasks with hierarchical structure.
Asynchronous Methods for Deep Reinforcement Learning
A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.