Your Echos are Heard: Tracking, Profiling, and Ad Targeting in the Amazon Smart Speaker Ecosystem

Umar Iqbal, Pouneh Nikkhah Bahrami, Rahmadi Trimananda, Hao Cui, Alexander Gamero-Garrido, Daniel Dubois, David R. Choffnes, Athina Markopoulou, Franziska Roesner, Zubair Shafiq
Smart speakers collect voice input that can be used to infer sensitive information about users. Given a number of egregious privacy breaches, there is a clear unmet need for greater transparency and control over data collection, sharing, and use by smart speaker platforms as well as third party skills supported on them. To bridge the gap, we build an auditing framework that leverages online advertising to measure data collection, its usage, and its sharing by the smart speaker platforms. We…

Challenges in inferring privacy properties of smart devices: towards scalable multi-vantage point testing methods

This paper explores and discusses open research challenges in this complex domain and sketches a roadmap to build scalable methods for smart home privacy testing.

Surveillance Capitalism or Democracy? The Death Match of Institutional Orders and the Politics of Knowledge in Our Information Civilization

Surveillance capitalism is what happened when US democracy stood down. Two decades later, it fails any reasonable test of responsible global stewardship of digital information and communications. The…

VoiceBlock: Privacy through Real-Time Adversarial Attacks with Audio-to-Audio Models

This work proposes a neural network model capable of adversarially modifying a user’s audio stream in real time, and demonstrates that the model is highly effective at de-identifying user speech against speaker recognition and can transfer to an unseen recognition system.

PoliGraph: Automated Privacy Policy Analysis using Knowledge Graphs

This paper views and analyzes, for the first time, the entire text of a privacy policy in an integrated way, and revisits the notion of ontologies, previously defined in heuristic ways, to capture subsumption relations between terms.

A CI-based Auditing Framework for Data Collection Practices

This paper argues that the contextual integrity (CI) tuple can be the basic building block for defining and implementing such an auditing framework, and elaborates on the special case where the tuple is extracted partly from the network traffic generated by the end device of interest and partly from the corresponding privacy policies using natural language processing (NLP) techniques.

Protecting Health Privacy through Reasonable Inferences

This paper presents a meta-answering of the dimensions of agency in the digital age that aims to clarify the role of informed consent in the context of data mining.

Locally Authenticated Privacy-preserving Voice Input

A secure, flexible, privacy-preserving system that captures and stores an on-device fingerprint of the users’ raw signals for authentication, instead of sending or sharing the raw biometric signals, and fuses multiple predictors’ decisions to make a final decision on whether the user input is legitimate.

Actions Speak Louder than Words: Entity-Sensitive Privacy Policy and Data Flow Analysis with PoliCheck

By defining a novel automated, entity-sensitive flow-to-policy consistency analysis, PoliCheck provides the highest-precision method to date to determine if applications properly disclose their privacy-sensitive behaviors.

Online Tracking: A 1-million-site Measurement and Analysis

  • In ACM Conference on Computer and Communications Security (CCS)
  • 2016

If you are not paying for it, you are the product: how much do advertisers pay to reach you?

This study develops a first-of-its-kind methodology for computing exactly that, the price paid for a web user by the ad ecosystem, and it can estimate a user's advertising worth with more than 82% accuracy.

What Can You Hear? And What Will You Do With It? amazon-google-privacy-digital-assistants.html

  • 2018

SkillDetective: Automated Policy-Violation Detection of Voice Assistant Applications in the Wild

This work designs and develops SkillDetective, an interactive testing tool capable of exploring voice-apps’ behaviors and identifying policy violations in an automated manner, and evaluates voice-apps’ conformity to 52 different policy requirements in a broader context from multiple sources including textual, image and audio files.

Khaleesi: Breaker of Advertising and Tracking Request Chains

It is shown that Khaleesi achieves high accuracy that holds well over time, is generally robust against evasion attempts, outperforms existing approaches, is suitable for online deployment, and improves page load performance.

Auditing Network Traffic and Privacy Policies in Oculus VR

Compared to the mobile and other app ecosystems, OVR is found to be more centralized and driven by tracking and analytics, rather than by third-party advertising.

Blocking Without Breaking: Identification and Mitigation of Non-Essential IoT Traffic

This paper develops a rigorous methodology that relies on automated IoT-device experimentation to reveal which network connections are essential, and which are not, and proposes a set of guidelines for automatically limiting non-essential IoT traffic.

SkillVet: Automated Traceability Analysis of Amazon Alexa Skills

This article presents the largest systematic measurement of the Amazon Alexa skill ecosystem to date, and studies developers’ practices in this ecosystem, including how they collect and justify the need for sensitive information, by designing a methodology to identify over-privileged skills with broken privacy policies.