DolphinAttack: Inaudible Voice Commands

@inproceedings{Zhang2017DolphinAttackIV,
  title={DolphinAttack: Inaudible Voice Commands},
  author={Guoming Zhang and Chen Yan and Xiaoyu Ji and Tianchen Zhang and Taimin Zhang and Wenyuan Xu},
  booktitle={Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security},
  year={2017}
}
  • Guoming Zhang, Chen Yan, Xiaoyu Ji, Tianchen Zhang, Taimin Zhang, Wenyuan Xu
  • Published 2017
  • Computer Science
  • Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security
Speech recognition (SR) systems such as Siri or Google Now have become an increasingly popular human-computer interaction method, and have turned various systems into voice controllable systems (VCS). Prior work on attacking VCS shows that hidden voice commands that are incomprehensible to people can control these systems. Hidden voice commands, though "hidden", are nonetheless audible. In this work, we design a totally inaudible attack, DolphinAttack, that modulates voice commands on…
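The idea of carrying a voice command on an inaudible carrier can be illustrated with a small numerical sketch. It assumes plain amplitude modulation and models the receiving microphone as a purely quadratic nonlinearity; both are simplifying assumptions made here for illustration, and the sample rate, carrier frequency, test tone, and modulation depth are hypothetical values, not parameters taken from the paper.

```python
import cmath
import math

# All constants below are illustrative assumptions, not values from the paper.
SAMPLE_RATE = 192_000  # Hz, high enough to represent the carrier and its harmonics
CARRIER_HZ = 30_000    # Hz, an ultrasonic carrier above the audible range
TONE_HZ = 1_000        # Hz, a single tone standing in for a voice command
DEPTH = 0.5            # modulation depth
N = 1_920              # 10 ms of samples, so every tone completes whole cycles


def amplitude_modulate(n):
    """Sample n of the tone amplitude-modulated onto the carrier."""
    t = n / SAMPLE_RATE
    baseband = math.cos(2 * math.pi * TONE_HZ * t)
    carrier = math.cos(2 * math.pi * CARRIER_HZ * t)
    return (1 + DEPTH * baseband) * carrier


def tone_amplitude(signal, freq_hz):
    """Amplitude of one frequency component, measured with a single-bin DFT."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * n / SAMPLE_RATE)
        for n, x in enumerate(signal)
    )
    return 2 * abs(acc) / len(signal)


transmitted = [amplitude_modulate(n) for n in range(N)]

# Hypothetical receiver model: a purely quadratic nonlinearity.
received = [x * x for x in transmitted]

audible_before = tone_amplitude(transmitted, TONE_HZ)  # energy at 1 kHz in the air
audible_after = tone_amplitude(received, TONE_HZ)      # energy at 1 kHz after squaring

print(f"1 kHz amplitude in transmitted signal: {audible_before:.6f}")
print(f"1 kHz amplitude after nonlinearity:    {audible_after:.6f}")
```

In this model the transmitted signal contains energy only at the carrier and its two sidebands, so nothing appears at the audible tone frequency. Squaring the signal produces a baseband term whose amplitude equals the modulation depth, which is how the tone reappears at the receiver.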
The Feasibility of Injecting Inaudible Voice Commands to Voice Assistants
TLDR
A completely inaudible attack, DolphinAttack, that modulates voice commands on ultrasonic carriers to achieve inaudibility. The authors validate that it is feasible to detect DolphinAttack by classifying the audio with a support vector machine (SVM), and suggest redesigning voice assistants to be resilient to inaudible voice command attacks.
CommanderSong: A Systematic Approach for Practical Adversarial Voice Recognition
TLDR
Novel techniques are developed that address a key technical challenge: integrating the commands into a song in a way that can be effectively recognized by ASR through the air, in the presence of background noise, while not being detected by a human listener.
Canceling Inaudible Voice Commands Against Voice Control Systems
TLDR
An active inaudible-voice-command cancellation (AIC) design is presented, which can reliably detect and capture the attack signal with the help of a custom-designed "guard" signal transmitter and create a special spectrum in the passband of the VCS microphone, based on which it is able to neutralize the attack signal in software.
Inaudible Voice Commands: The Long-Range Attack and Defense
TLDR
The core ideas emerge from a careful forensics on voice, i.e., finding indelible traces of nonlinearity in recorded voice signals, followed by defenses that only require software changes to the microphone.
Attacking Speaker Recognition Systems with Phoneme Morphing
TLDR
This paper presents a method that transforms the voice of one person so that it resembles the voice of a victim, such that it can be used to deceive speaker recognition systems into believing an utterance was spoken by the victim.
SurfingAttack: Interactive Hidden Attack on Voice Assistants Using Ultrasonic Guided Waves
TLDR
A new attack called SurfingAttack is designed that would enable multiple rounds of interactions between the voice-controlled device and the attacker over a longer distance and without the need to be in line-of-sight, and enables new attack scenarios, such as hijacking a mobile Short Message Service passcode, making ghost fraud calls without owners’ knowledge, etc.
Defeating hidden audio channel attacks on voice assistants via audio-induced surface vibrations
TLDR
The system is based on the premise that while the crafted audio features of the hidden voice commands may fool an authentication system in the audio domain, their unique audio-induced surface vibrations captured by the motion sensor are hard to forge.
ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems
TLDR
To the best of the authors' knowledge, this is the first publicly available database that has been specifically designed for the protection of state-of-the-art voice-controlled systems against various replay attacks in various conditions and environments.
Practical Hidden Voice Attacks against Speech and Speaker Recognition Systems
TLDR
This paper exploits knowledge of the signal processing algorithms commonly used by VPSes to generate the data fed into machine learning systems, relying on the fact that multiple source audio samples have similar feature vectors when transformed by acoustic feature extraction algorithms.
When the Differences in Frequency Domain are Compensated: Understanding and Defeating Modulated Replay Attacks on Automatic Speech Recognition
TLDR
The authors discover and formally prove that, no matter how the replay audio signals are modulated, replay attacks will either leave ringing artifacts in the time domain or cause spectrum distortion in the frequency domain.

References

SHOWING 1-10 OF 58 REFERENCES
Cocaine Noodles: Exploiting the Gap between Human and Machine Speech Recognition
TLDR
It is found that differences in how humans and machines understand spoken speech can be easily exploited by an adversary to produce sound which is intelligible as a command to a computer speech recognition system but is not easily understandable by humans.
Hidden Voice Commands
TLDR
This paper explores how voice interfaces can be attacked with hidden voice commands that are unintelligible to human listeners but are interpreted as commands by devices.
BackDoor: Making Microphones Hear Inaudible Sounds
TLDR
BackDoor is a system that develops the technical building blocks for harnessing non-linearities in microphone hardware and achieves upwards of 4 kbps for proximate data communication, as well as room-level privacy protection against electronic eavesdropping.
Speech Recognition using MFCC
TLDR
This paper describes an approach to speech recognition using the Mel-Scale Frequency Cepstral Coefficients (MFCC) extracted from the speech signal of spoken words, and shows that recognition rates improve significantly when the SVM is trained with more MFCC samples randomly selected from the database.
Cepstral domain segmental feature vector normalization for noise robust speech recognition
TLDR
A segmental feature vector normalization technique is proposed which makes an automatic speech recognition system more robust to environmental changes by normalizing the output of the signal-processing front-end to have similar segmental parameter statistics in all noise conditions.
All Your Voices are Belong to Us: Stealing Voices to Fool Humans and Machines
TLDR
The attacker uses voice morphing techniques to transform its voice – speaking any arbitrary message – into the victim’s voice and examines the aftermaths of such a voice impersonation capability against two important applications and contexts.
Gyrophone: Recognizing Speech from Gyroscope Signals
TLDR
It is shown that the MEMS gyroscopes found on modern smart phones are sufficiently sensitive to measure acoustic signals in the vicinity of the phone and that this information is sufficient to identify speaker information and even parse speech.
Acoustic Side-Channel Attacks on Printers
TLDR
A novel attack is presented that recovers what a dot-matrix printer processing English text is printing based on a record of the sound it makes, if the microphone is close enough to the printer.
Your Voice Assistant is Mine: How to Abuse Speakers to Steal Information and Control Your Phone
TLDR
A novel approach (GVS-Attack) to launch permission bypassing attacks from a zero-permission Android application (VoicEmployer) through the phone speaker, which can forge SMS/Email, access privacy information, transmit sensitive data and achieve remote control without any permission.
Soundcomber: A Stealthy and Context-Aware Sound Trojan for Smartphones
TLDR
This work presents Soundcomber, a Trojan with few and innocuous permissions, that can extract a small amount of targeted private information from the audio sensor of the phone, and performs efficient, stealthy local extraction, thereby greatly reducing the communication cost for delivering stolen data.