Dangerous Skills: Understanding and Mitigating Security Risks of Voice-Controlled Third-Party Functions on Virtual Personal Assistant Systems

@inproceedings{Zhang2019DangerousSU,
  title={Dangerous Skills: Understanding and Mitigating Security Risks of Voice-Controlled Third-Party Functions on Virtual Personal Assistant Systems},
  author={Nan Zhang and Xianghang Mi and Xuan Feng and Xiaofeng Wang and Yuan Tian and Feng Qian},
  booktitle={2019 IEEE Symposium on Security and Privacy (SP)},
  year={2019},
  pages={1381--1396}
}
Virtual personal assistants (VPAs), such as Amazon Alexa and Google Assistant, today mostly rely on the voice channel to communicate with their users; this channel, however, is known to be vulnerable, lacking proper authentication from the user to the VPA. A new authentication challenge, from the VPA service to the user, has emerged with the rapid growth of the VPA ecosystem, which allows a third party to publish a function (called a skill) for the service and can therefore be exploited to spread malicious… 

Citations

Security and Privacy Issues with Virtual Private Voice Assistants
A survey of virtual private voice assistants (VPVAs) such as Alexa, Google Assistant, and Apple's Siri, covering how users have adopted the technology, recent advancements, privacy concerns, and security.
MegaMind: a platform for security & privacy extensions for voice assistants
TLDR
MegaMind is a novel extensible platform that lets a user deploy security and privacy extensions locally on their voice assistant and protects against malicious extensions by providing two important guarantees, namely permission enforcement and non-interference.
On the Security and Privacy Challenges of Virtual Assistants
TLDR
This review identifies a gap in the current state of the art in VA research (no prior literature reviews on the topic exist) and sheds light on future research directions, such as providing solutions to perform voice authentication without an external device and the compliance of VAs with privacy regulations.
SkillFence: A Systems Approach to Practically Mitigating Voice-Based Confusion Attacks
TLDR
The work shows how information from counterpart apps can help reduce dis-ambiguities in the skill invocation process and builds SkillFence, a browser extension that existing voice assistant users can install to ensure that only legitimate skills run in response to their commands.
ALEXA VERSUS ALEXA: Controlling Smart Speakers by Self-Issuing Voice Commands
TLDR
It is verified that, via AvA, attackers can control smart appliances within the household, buy unwanted items, tamper with linked calendars, and eavesdrop on the user, and that the existing countermeasures against AvA are hardly used in practice.
What a SHAME: Smart Assistant Voice Command Fingerprinting Utilizing Deep Learning
TLDR
The serious privacy threats exposed by information leakage in a smart assistant's encrypted network traffic metadata are highlighted and the Smart Home Assistant Malicious Ensemble model (SHAME) is proposed as the new state-of-the-art Voice Command Fingerprinting classifier.
A Survey on Voice Assistant Security: Attacks and Countermeasures
TLDR
This article systematizes a broad category of relevant but seemingly unrelated attacks by the vulnerable system components and attack methods, and categorizes existing countermeasures based on the defensive strategies from a system designer's perspective, to assist designers in planning defenses based on their demands.
"Alexa, stop spying on me!": speech privacy protection against voice assistants
TLDR
MicShield introduces a novel selective jamming mechanism that obfuscates the user's private speech while passing legitimate voice commands to the VAs, achieved using a phoneme-level jamming control pipeline.
Private Data Harvesting on Alexa Using Third-Party Skills
TLDR
This work develops four malicious Skills for Alexa, the voice assistant developed by Amazon, that harvest pieces of private information from users, and proposes an adversarial framework that deceives users into disclosing private information.
Hey Alexa, Who Am I Talking to?: Analyzing Users’ Perception and Awareness Regarding Third-party Alexa Skills
TLDR
An interactive user study, in which participants listened to and interacted with real-world skills using the official Alexa app, finds that most participants fail to identify the skill developer correctly and cannot correctly determine which skills will be automatically activated through the voice interface.

References

SHOWING 1-10 OF 38 REFERENCES
AuDroid: Preventing Attacks on Audio Channels in Mobile Devices
TLDR
AuDroid, an extension to the SELinux reference monitor integrated into the Android operating system that enforces lattice security policies over the dynamically changing use of system audio resources, shows that it is possible to prevent attacks using audio channels without compromising functionality or introducing significant performance overhead.
CommanderSong: A Systematic Approach for Practical Adversarial Voice Recognition
TLDR
Novel techniques are developed that address a key technical challenge: integrating the commands into a song in a way that can be effectively recognized by ASR through the air, in the presence of background noise, while not being detected by a human listener.
A11y Attacks: Exploiting Accessibility in Operating Systems
TLDR
This paper presents the first security evaluation of accessibility support for four of the most popular computing platforms: Microsoft Windows, Ubuntu Linux, iOS, and Android, and identifies twelve attacks that can bypass state-of-the-art defense mechanisms deployed on these OSs.
Smart Locks: Lessons for Securing Commodity Internet of Things Devices
TLDR
This work examines the security of home smart locks: cyber-physical devices that replace traditional door locks with deadbolts that can be electronically controlled by mobile devices or the lock manufacturer's remote servers and proposes several defenses that mitigate the attacks.
Hidden Voice Commands
TLDR
This paper explores how voice interfaces can be attacked with hidden voice commands that are unintelligible to human listeners but are interpreted as commands by devices.
Cocaine Noodles: Exploiting the Gap between Human and Machine Speech Recognition
TLDR
It is found that differences between human and machine speech recognition can be easily exploited by an adversary to produce sound that is intelligible as a command to a computer speech recognition system but is not easily understandable by humans.
SmartAuth: User-Centered Authorization for the Internet of Things
TLDR
The technique, called SmartAuth, automatically collects security-relevant information from an IoT app’s description, code and annotations, and generates an authorization user interface to bridge the gap between the functionalities explained to the user and the operations the app actually performs.
Android UI Deception Revisited: Attacks and Defenses
TLDR
This work finds that a previously proposed solution has a significant side-channel vulnerability, as well as susceptibility to clickjacking, that allows non-privileged malware to completely compromise the defenses and successfully steal passwords or other keyboard input.
Unleashing the Walking Dead: Understanding Cross-App Remote Infections on Mobile WebViews
TLDR
The contention between the demand for convenient cross-WebView communication and the need for security control on the channel is revealed, and the first step toward building OS-level protection to safeguard this fast-growing technology is made.
Peeking into Your App without Actually Seeing It: UI State Inference and Novel Android Attacks
TLDR
It is reported that on the Android system (and likely other OSes), a weaker form of GUI confidentiality can be breached, in the form of UI state, by a background app without requiring any permissions, leading to a class of attacks named UI state inference attacks.