Crafting adversarial input sequences for recurrent neural networks

@article{Papernot2016CraftingAI,
  title={Crafting adversarial input sequences for recurrent neural networks},
  author={Nicolas Papernot and Patrick Mcdaniel and Ananthram Swami and Richard E. Harang},
  journal={MILCOM 2016 - 2016 IEEE Military Communications Conference},
  year={2016},
  pages={49-54}
}
Machine learning models are frequently used to solve complex security problems, as well as to make decisions in sensitive situations like guiding autonomous vehicles or predicting financial market behaviors. In an experiment, we show that adversaries can craft adversarial sequences that mislead both categorical and sequential recurrent neural networks.
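
The technique summarized in the abstract is a gradient-guided search over input tokens. As a rough illustration only, the sketch below shows one way such a greedy word-substitution attack can be written; the model interface, embedding layer, and swap budget are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

def craft_adversarial_sequence(model: nn.Module,
                               embedding: nn.Embedding,
                               token_ids: torch.Tensor,  # shape (seq_len,), dtype long
                               true_label: int,
                               n_swaps: int = 3) -> torch.Tensor:
    """Greedily replace the most influential tokens so that the classifier's
    loss on the true label increases (a rough analogue of forward-derivative
    guided crafting; not the paper's exact procedure)."""
    token_ids = token_ids.clone()
    vocab_vectors = embedding.weight.detach()             # (vocab_size, dim)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(n_swaps):
        embedded = embedding(token_ids).detach().requires_grad_(True)
        logits = model(embedded.unsqueeze(0))              # assumes model accepts embeddings
        loss = loss_fn(logits, torch.tensor([true_label]))
        loss.backward()
        grad = embedded.grad                               # (seq_len, dim)

        # Pick the position whose embedding gradient has the largest norm.
        pos = int(grad.norm(dim=1).argmax())

        # Replace it with the vocabulary token whose embedding moves furthest
        # in the loss-increasing direction (first-order approximation).
        scores = (vocab_vectors - embedded[pos].detach()) @ grad[pos]
        token_ids[pos] = int(scores.argmax())

    return token_ids
```

Selecting the position by gradient norm plays the role that the forward derivative (Jacobian) plays in the saliency-map attacks this paper builds on; the vocabulary projection keeps the perturbed sequence made of valid tokens.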

Citations

Detecting Adversarial Examples Using Data Manifolds
TLDR
Finding the limitations of the learning model presents a more tractable approach to protecting against adversarial attacks, based on identifying a low-dimensional manifold in which the training samples lie and using the distance of a new observation from this manifold to decide whether the data point is adversarial.
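
As a concrete illustration of the manifold-distance idea summarized above, here is a minimal sketch that approximates the training manifold with PCA and flags points whose reconstruction error exceeds a calibrated threshold. The choice of PCA, the 95th-percentile threshold, and the feature space are illustrative assumptions, not the cited paper's method.

```python
import numpy as np

class ManifoldDetector:
    def __init__(self, n_components: int = 50):
        self.n_components = n_components

    def fit(self, train_features: np.ndarray) -> "ManifoldDetector":
        self.mean_ = train_features.mean(axis=0)
        centered = train_features - self.mean_
        # The leading principal directions approximate the data manifold.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        self.components_ = vt[: self.n_components]
        # Calibrate a threshold from clean training data.
        errors = self._reconstruction_error(train_features)
        self.threshold_ = np.percentile(errors, 95)
        return self

    def _reconstruction_error(self, x: np.ndarray) -> np.ndarray:
        centered = x - self.mean_
        projected = centered @ self.components_.T @ self.components_
        return np.linalg.norm(centered - projected, axis=1)

    def is_adversarial(self, x: np.ndarray) -> np.ndarray:
        # Points far from the training manifold are flagged as adversarial.
        return self._reconstruction_error(x) > self.threshold_
```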
Perturbation Analysis of Learning Algorithms: Generation of Adversarial Examples From Classification to Regression
TLDR
This work shows that adversarial examples can be generated using a generic approach that relies on the perturbation analysis of learning algorithms, and derives new attacks against classification algorithms which are shown to be top-performing on various architectures.
Adversarial Reprogramming of Sequence Classification Neural Networks
TLDR
This work introduces a context-based vocabulary remapping model to reprogram neural networks trained on a specific sequence classification task for a new sequence classification task desired by the adversary.
Defense Methods Against Adversarial Examples for Recurrent Neural Networks
TLDR
This paper presents a novel defense method, termed sequence squeezing, to make RNN classifiers more robust against adversarial attacks, and implements four additional RNN defense methods inspired by recently published CNN defense methods.
Perturbation Analysis of Learning Algorithms: A Unifying Perspective on Generation of Adversarial Examples
TLDR
A general framework based on the perturbation analysis of learning algorithms, formulated as a convex program, is proposed; it recovers many current adversarial attacks as special cases and is used to generate adversarial perturbations for regression tasks.
RTA3: A Real Time Adversarial Attack on Recurrent Neural Networks
TLDR
This paper successfully learns an attack policy to generate adversarial perturbations against the DeepSpeech ASR system and demonstrates that this attack policy generalizes to a set of unseen examples in real time.
Explainability and Adversarial Robustness for RNNs
TLDR
The Adversarial Robustness Score (ARS) is proposed for comparing IDSs and it is shown that an adversarial training procedure can significantly and successfully reduce the attack surface.
MagNet: A Two-Pronged Defense against Adversarial Examples
TLDR
MagNet, a framework for defending neural network classifiers against adversarial examples, is proposed and it is shown empirically that MagNet is effective against the most advanced state-of-the-art attacks in blackbox and graybox scenarios without sacrificing false positive rate on normal examples.
Feedback-Control Based Adversarial Attacks on Recurrent Neural Networks
TLDR
It is shown how control theory-based analysis tools can be leveraged to compute these adversarial input disturbances, and obtain bounds on how they impact the neural network performance.
Understanding Adversarial Examples Through Deep Neural Network's Response Surface and Uncertainty Regions
TLDR
This study reveals the structural problem of the DNN classification boundary that leads to adversarial examples, and argues that generalization error, the large-sample theoretical guarantee established for DNNs, cannot adequately capture the phenomenon of adversarial examples.

References

Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks
TLDR
The study shows that defensive distillation can reduce the effectiveness of sample creation from 95% to less than 0.5% on a studied DNN, and analytically investigates the generalizability and robustness properties granted by the use of defensive distillation when training DNNs.
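
For context, defensive distillation trains a second network on the first network's temperature-softened outputs. A minimal sketch of the distillation loss follows, assuming a generic PyTorch classifier; the temperature of 20 is an illustrative choice, not necessarily the paper's setting.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature: float = 20.0):
    """Cross-entropy between the teacher's temperature-softened labels and
    the student's temperature-softened predictions."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=1)
    log_probs = F.log_softmax(student_logits / temperature, dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()
```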
The Limitations of Deep Learning in Adversarial Settings
TLDR
This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
Practical Black-Box Attacks against Deep Learning Systems using Adversarial Examples
TLDR
This work introduces the first practical demonstration that the cross-model transfer phenomenon enables attackers to control a remotely hosted DNN with no access to the model, its parameters, or its training data, and introduces the attack strategy of fitting a substitute model to input-output pairs obtained from the remote model, then crafting adversarial examples based on this auxiliary model.
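
A minimal sketch of the substitute-model strategy described here: label a small surrogate dataset with the remote model's outputs, fit a local substitute, and then craft adversarial examples against the substitute. The `query_remote_model` oracle, the surrogate data, and the training schedule are hypothetical placeholders.

```python
import torch
import torch.nn as nn

def train_substitute(query_remote_model, surrogate_inputs: torch.Tensor,
                     substitute: nn.Module, epochs: int = 10) -> nn.Module:
    # The remote model is used only as a labeling oracle.
    with torch.no_grad():
        labels = query_remote_model(surrogate_inputs).argmax(dim=1)
    optimizer = torch.optim.Adam(substitute.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(substitute(surrogate_inputs), labels)
        loss.backward()
        optimizer.step()
    # Adversarial examples crafted against the substitute with any white-box
    # method (e.g. a gradient-based attack) often transfer to the remote model.
    return substitute
```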
Practical Black-Box Attacks against Machine Learning
TLDR
This work introduces the first practical demonstration of an attacker controlling a remotely hosted DNN with no such knowledge, and finds that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.
Explaining and Harnessing Adversarial Examples
TLDR
It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.
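
The linearity argument in this reference motivates the fast gradient sign method: take one signed gradient step per input dimension. A minimal PyTorch sketch, with an illustrative epsilon and no input-range clipping, is shown below.

```python
import torch
import torch.nn as nn

def fgsm(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
         epsilon: float = 0.1) -> torch.Tensor:
    """Perturb x in the sign of the loss gradient to increase the loss."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(x_adv), y)
    loss.backward()
    # One linear step of size epsilon per input dimension.
    return (x_adv + epsilon * x_adv.grad.sign()).detach()
```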
Machine Learning in Adversarial Settings
TLDR
The authors consider the underlying causes of adversarial samples and the future countermeasures that might mitigate them.
Malware classification with recurrent networks
TLDR
This work proposes a different approach, which, similar to natural language modeling, learns the language of malware spoken through the executed instructions and extracts robust, time domain features.
Intriguing properties of neural networks
TLDR
It is found that there is no distinction between individual high-level units and random linear combinations of high-level units, according to various methods of unit analysis, and it is suggested that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks.
Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding
TLDR
The results show that on this task, both types of recurrent networks outperform the CRF baseline substantially, and that a bi-directional Jordan-type network that takes into account both past and future dependencies among slots works best, outperforming a CRF-based baseline by 14% in relative error reduction.
Large-scale malware classification using random projections and neural networks
TLDR
This work uses random projections to further reduce the dimensionality of the original input space and trains several very large-scale neural network systems with over 2.6 million labeled samples, thereby achieving classification results with a two-class error rate of 0.49% for a single neural network and 0.42% for an ensemble of neural networks.