Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

@article{Carlini2018AudioAE,
  title={Audio Adversarial Examples: Targeted Attacks on Speech-to-Text},
  author={Nicholas Carlini and David A. Wagner},
  journal={2018 IEEE Security and Privacy Workshops (SPW)},
  year={2018},
  pages={1-7}
}

We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (recognizing up to 50 characters per second of audio). We apply our white-box iterative optimization-based attack to Mozilla's implementation of DeepSpeech end-to-end, and show it has a 100% success rate. The feasibility of this attack introduces a new domain to study adversarial examples.
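
A minimal sketch of this style of attack, assuming a hypothetical differentiable ASR model asr_model that maps a raw waveform to per-frame character log-probabilities; the paper itself optimizes against DeepSpeech's CTC loss and uses a decibel-scale distortion bound rather than the crude clamp below.

import torch

def targeted_audio_attack(asr_model, waveform, target_ids, eps=0.05, lr=1e-3, steps=1000):
    # waveform: 1-D float tensor in [-1, 1]; target_ids: long tensor encoding the chosen phrase.
    delta = torch.zeros_like(waveform, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    ctc = torch.nn.CTCLoss(blank=0)
    for _ in range(steps):
        adv = torch.clamp(waveform + delta, -1.0, 1.0)
        log_probs = asr_model(adv)                        # (time, 1, alphabet), log-softmaxed
        input_lengths = torch.tensor([log_probs.size(0)])
        target_lengths = torch.tensor([target_ids.numel()])
        loss = ctc(log_probs, target_ids.unsqueeze(0), input_lengths, target_lengths)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                       # crude stand-in for the paper's dB bound
    return torch.clamp(waveform + delta, -1.0, 1.0).detach()

Because the loss is differentiable end-to-end, the same loop works for any target phrase; the perturbation budget and iteration count trade off how imperceptible the result is.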

Audio Adversarial Examples: Attacks Using Vocal Masks
TLDR
The feasibility of this audio adversarial attack introduces a new domain to study machine and human perception of speech and shows that these adversarial examples fool state-of-the-art speech-to-text systems, yet humans are able to consistently pick out the speech.
Towards Mitigating Audio Adversarial Perturbations
TLDR
This work aims to explore the robustness of these audio adversarial examples generated via two attack strategies by applying different signal processing methods to recover the original audio sequence, and shows that by inspecting the temporal consistency in speech signals, it can potentially identify non-adaptive audio adversaries.
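
A rough sketch of the temporal-consistency idea summarized above, assuming a hypothetical transcribe(waveform) wrapper around the ASR system: transcribe a prefix of the audio and check whether it agrees with the beginning of the full transcription, since perturbations crafted for the whole utterance tend not to survive truncation. The split point and threshold are illustrative.

from difflib import SequenceMatcher

def looks_adversarial(waveform, transcribe, k=0.5, threshold=0.5):
    # Transcribe the full audio and its first k fraction, then compare the
    # prefix transcription with the corresponding head of the full one.
    full_text = transcribe(waveform)
    prefix_text = transcribe(waveform[: int(len(waveform) * k)])
    head = full_text[: len(prefix_text)]
    similarity = SequenceMatcher(None, prefix_text, head).ratio()
    # Benign speech is roughly consistent across the two; a large mismatch
    # suggests a (non-adaptive) adversarial perturbation.
    return similarity < threshold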
Adversarial attack on Speech-to-Text Recognition Models
TLDR
This paper introduces the first study of weighted-sampling audio adversarial examples, specifically focusing on the number and positions of distortions to reduce the search space, and proposes a new attack scenario, audio injection attack, which offers some novel insights into the concealment of adversarial attacks.
EvolMusic: towards musical adversarial examples for black-box attacks on speech-to-text
TLDR
EvolMusic is presented, the first targeted adversarial attack based on musical note-sequences, generated via an adaptive evolutionary approach in a black-box setting and evaluated against DeepSpeech v0.9.1 using the Fluent Speech Commands dataset.
Towards Robust Speech-to-Text Adversarial Attack
This paper introduces a novel adversarial algorithm for attacking the state-of-the-art speech-to-text systems, namely DeepSpeech, Kaldi, and Lingvo. Our approach is based on developing an extension …
Robust Audio Adversarial Example for a Physical Attack
TLDR
Evaluation and a listening experiment demonstrated that adversarial examples generated by the proposed method are able to attack a state-of-the-art speech recognition model in the physical world without being noticed by humans, suggesting that audio adversarial examples may become a real threat.
Generating Robust Audio Adversarial Examples
TLDR
A new approach to generating adversarial audio using Iterative Proportional Clipping (IPC), which exploits temporal dependency in the original audio to significantly limit human-perceptible noise and can bypass temporal-dependency-based defense mechanisms.
Targeted Adversarial Examples for Black Box Audio Systems
TLDR
This paper adopts a black-box approach to adversarial generation, combining the approaches of both genetic algorithms and gradient estimation to solve the ASR fooling task.
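
A simplified sketch of the genetic-algorithm half of such a black-box attack, assuming a hypothetical query-only transcribe oracle and scoring candidates by edit similarity to the target phrase; the gradient-estimation component mentioned above is omitted, and all hyperparameters are illustrative.

import numpy as np
from difflib import SequenceMatcher

def genetic_attack(waveform, target_text, transcribe, pop_size=20, eps=0.01, iters=500, mutation_p=0.1):
    def fitness(candidate):
        # Query-only score: how close the transcription is to the target phrase.
        return SequenceMatcher(None, transcribe(candidate), target_text).ratio()

    population = [waveform + np.random.uniform(-eps, eps, waveform.shape) for _ in range(pop_size)]
    for _ in range(iters):
        scores = np.array([fitness(c) for c in population])
        if scores.max() == 1.0:                           # exact target transcription reached
            break
        probs = scores / scores.sum() if scores.sum() > 0 else None
        children = []
        for _ in range(pop_size):
            i, j = np.random.choice(pop_size, 2, p=probs)
            mask = np.random.rand(*waveform.shape) < 0.5  # uniform crossover
            child = np.where(mask, population[i], population[j])
            child = child + np.random.uniform(-eps, eps, waveform.shape) * (np.random.rand(*waveform.shape) < mutation_p)
            children.append(np.clip(child, waveform - eps, waveform + eps))
        population = children
    return population[int(np.argmax([fitness(c) for c in population]))]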
Robust Over-the-Air Adversarial Examples Against Automatic Speech Recognition Systems
TLDR
This paper demonstrates the first generic algorithm that produces adversarial examples which remain robust in an over-the-air attack, such that the ASR system outputs the target transcription even after the audio is actually replayed.
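
One way to read this over-the-air requirement is as an expectation-over-transformations objective: during optimization, the candidate audio is passed through randomly sampled playback simulations before the attack loss is computed. A hedged sketch along those lines, assuming a hypothetical list rirs of room-impulse-response tensors and an attack_loss function such as the CTC term from the sketch near the top of this page.

import random
import torch
import torch.nn.functional as F

def over_the_air_loss(adv_audio, rirs, attack_loss, n_samples=4, noise_std=0.01):
    # Average the attack loss over simulated playback conditions so the
    # optimized audio keeps working after being played through a speaker.
    total = 0.0
    for _ in range(n_samples):
        rir = random.choice(rirs).flip(0)                 # flip so conv1d acts as a convolution
        simulated = F.conv1d(adv_audio.view(1, 1, -1),
                             rir.view(1, 1, -1),
                             padding=rir.numel() - 1).view(-1)
        simulated = simulated[: adv_audio.numel()] + noise_std * torch.randn_like(adv_audio)
        total = total + attack_loss(simulated)
    return total / n_samples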
Defending and Detecting Audio Adversarial Example using Frame Offsets
TLDR
By analyzing the characteristics of ASR systems, it is found that frame offsets, obtained by appending a silence clip at the beginning of an audio, can degrade adversarial perturbations into normal noise and can defend against audio adversarial examples in a simpler, more generic, and more efficient way.
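
A minimal sketch of that defense idea, assuming a hypothetical transcribe wrapper around the ASR system: prepend a short silence clip so the audio is framed differently, re-transcribe, and flag the input if the transcription changes substantially. The offset length and threshold are illustrative, not taken from the paper.

import numpy as np
from difflib import SequenceMatcher

def detect_with_frame_offset(waveform, transcribe, sample_rate=16000, offset_ms=30, threshold=0.8):
    # Shift the framing by prepending silence, then compare transcriptions.
    silence = np.zeros(int(sample_rate * offset_ms / 1000), dtype=waveform.dtype)
    shifted = np.concatenate([silence, waveform])
    similarity = SequenceMatcher(None, transcribe(waveform), transcribe(shifted)).ratio()
    # Benign audio transcribes almost identically; perturbations tuned to one
    # framing tend to break, so a large change is suspicious.
    return similarity < threshold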
...

References

SHOWING 1-10 OF 52 REFERENCES
Crafting Adversarial Examples For Speech Paralinguistics Applications
TLDR
This work proposes a novel end-to-end scheme to generate adversarial examples by perturbing directly the raw waveform of an audio recording rather than specific acoustic features, which can lead to a significant performance drop of state-of-the-art deep neural networks.
Deep Learning and Music Adversaries
TLDR
This work builds adversaries for deep learning systems applied to image object recognition by exploiting the parameters of the system to find the minimal perturbation of the input image such that the system misclassifies it with high confidence.
Houdini: Fooling Deep Structured Prediction Models
TLDR
This work introduces a novel flexible approach named Houdini for generating adversarial examples specifically tailored for the final performance measure of the task considered, be it combinatorial and non-decomposable.
Adversarial Examples for Generative Models
TLDR
This work explores methods of producing adversarial examples on deep generative models such as the variational autoencoder (VAE) and the VAE-GAN and presents three classes of attacks, motivating why an attacker might be interested in deploying such techniques against a target generative network.
Synthesizing Robust Adversarial Examples
TLDR
The existence of robust 3D adversarial objects is demonstrated, and the first algorithm for synthesizing examples that are adversarial over a chosen distribution of transformations is presented, which synthesizes two-dimensional adversarial images that are robust to noise, distortion, and affine transformation.
Adversarial Diversity and Hard Positive Generation
TLDR
A new psychometric perceptual adversarial similarity score (PASS) measure for quantifying adversarial images, the notion of hard positive generation is introduced, and a novel hot/cold approach for adversarial example generation is presented, which provides multiple possible adversarial perturbations for every single image.
Towards Evaluating the Robustness of Neural Networks
TLDR
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that are successful on both distilled and undistilled neural networks with 100% probability.
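
This reference supplies the optimization formulation that the audio attack above builds on: minimize a weighted sum of the perturbation size and a margin-style loss that becomes zero once the target class wins. A compact sketch for a generic differentiable classifier model (hypothetical here); the actual paper explores several loss variants, a change of variables, and a search over the trade-off constant c.

import torch

def cw_targeted_attack(model, x, target, c=1.0, lr=0.01, steps=500):
    # x: input with a leading batch dimension of 1; target: index of the desired class.
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        logits = model(x + delta)                         # shape (1, num_classes)
        target_logit = logits[0, target]
        other_max = logits[0][torch.arange(logits.size(1)) != target].max()
        margin = torch.clamp(other_max - target_logit, min=0.0)   # zero once the target class wins
        loss = (delta ** 2).sum() + c * margin
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (x + delta).detach()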
Delving into Transferable Adversarial Examples and Black-box Attacks
TLDR
This work is the first to conduct an extensive study of transferability over large models and a large-scale dataset, and it is also the first to study the transferability of targeted adversarial examples with their target labels.
Adversarial Perturbations Against Deep Neural Networks for Malware Classification
TLDR
This paper shows how to construct highly-effective adversarial sample crafting attacks for neural networks used as malware classifiers, and evaluates to which extent potential defensive mechanisms against adversarial crafting can be leveraged to the setting of malware classification.
Query-Efficient Black-box Adversarial Examples
TLDR
A new method is presented for reliably generating adversarial examples under more restricted, practical black-box threat models, along with a new algorithm for performing targeted adversarial attacks in the partial-information setting, where the attacker only has access to a limited number of target classes.
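
The query efficiency in this reference comes from estimating gradients from score queries alone. A hedged sketch of that idea using antithetic Gaussian sampling, where loss_fn is a hypothetical query-only attack loss built from the model's output scores; the sample count and smoothing scale are illustrative.

import numpy as np

def estimate_gradient(x, loss_fn, n_samples=50, sigma=0.001):
    # Probe the model with Gaussian-perturbed inputs and their negations,
    # then average to form a finite-difference gradient estimate.
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        u = np.random.randn(*x.shape)
        grad += (loss_fn(x + sigma * u) - loss_fn(x - sigma * u)) * u
    return grad / (2 * sigma * n_samples)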
...