Characterizing the Effect of Audio Degradation on Privacy Perception And Inference Performance in Audio-Based Human Activity Recognition

Dawei Liang, Wenting Song, and Edison Thomaz. 22nd International Conference on Human-Computer Interaction with Mobile Devices and Services.
Audio has been increasingly adopted as a sensing modality in a variety of human-centered mobile applications and in smart assistants in the home. Although acoustic features can capture complex semantic information about human activities and context, continuous audio recording often poses significant privacy concerns. An intuitive way to reduce privacy concerns is to degrade audio quality such that speech and other relevant acoustic markers become unintelligible, but this often comes at the cost… 
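To make the trade-off concrete, a common way to degrade audio quality is to reduce the effective sample rate so that high-frequency content (where much of speech intelligibility lies) is discarded while coarse acoustic energy patterns remain. The sketch below is illustrative only; the `degrade_audio` function, the block-averaging approach, and the chosen `factor` are assumptions for demonstration, not the degradation method evaluated in the paper.

```python
import numpy as np

def degrade_audio(samples: np.ndarray, factor: int = 16) -> np.ndarray:
    """Naively degrade audio by block-averaging and decimating.

    Averaging each block of `factor` samples acts as a crude low-pass
    filter, then keeping one value per block lowers the effective
    sample rate by `factor`. This removes fine spectral detail (hurting
    speech intelligibility) while preserving coarse energy envelopes
    that activity-recognition features often rely on. Illustrative
    sketch only, not the paper's method.
    """
    n = len(samples) - len(samples) % factor  # drop the ragged tail
    blocks = samples[:n].reshape(-1, factor)
    return blocks.mean(axis=1)

# Example: one second of a 440 Hz tone at 16 kHz, degraded to an
# effective rate of 1 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
degraded = degrade_audio(tone, factor=16)
print(len(degraded))  # 1000
```

In practice, the degradation factor controls the privacy/utility trade-off the abstract describes: larger factors make speech less intelligible but also remove information useful for recognizing activities.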


Theophany: Multimodal Speech Augmentation in Instantaneous Privacy Channels
This paper introduces Theophany, a privacy-preserving framework for augmenting speech, which develops the first privacy perception model to assess the privacy risk of a face-to-face conversation based on its topic, location, and participants.
Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study
A dual-branch neural network architecture is developed for the joint learning of voice and acoustic features during an AED process and thorough empirical studies are conducted to examine the performance on the public AudioSet with different types of inputs.


ESC: Dataset for Environmental Sound Classification
A new annotated collection of 2000 short clips comprising 50 classes of various common sound events, and an abundant unified compilation of 250000 unlabeled auditory excerpts extracted from recordings available through the Freesound project are presented.
Sound Privacy: A Conversational Speech Corpus for Quantifying the Experience of Privacy
A database is presented that quantifies the experience of privacy users have in spoken communication, enabling studies of how acoustic environments affect people's experience of privacy; such studies can inform speech-operated applications that respect users' right to privacy.
Audio-Based Activities of Daily Living (ADL) Recognition with Large-Scale Acoustic Embeddings from Online Videos
A framework for audio-based activity recognition is proposed that can make use of millions of embedding features from public online video sound clips; based on a combination of oversampling and deep learning, it requires no further feature processing or outlier filtering.
A Generative Model for Speech Segmentation and Obfuscation for Remote Health Monitoring
This paper presents a novel speech privacy preservation methodology using generative adversarial networks to segment human speech in a recorded audio and generate human-like random speech to replace the original segment.
GestEar: combining audio and motion sensing for gesture recognition on smartwatches
A lightweight convolutional neural network architecture for gesture recognition is presented, specifically designed to run locally on resource-constrained devices, achieving a user-independent recognition accuracy of 97.2% on nine distinct gestures.
Music, Search, and IoT
This work investigates and characterizes how voice assistants (VAs) are used in the home, the role of VAs as scaffolding for Internet of Things device control, and emergent privacy issues for VA users.
Privacy-Preserving Variational Information Feature Extraction for Domestic Activity Monitoring versus Speaker Identification
It is empirically demonstrated that the proposed method reduces speaker identification privacy risks without significantly deprecating the performance of domestic activity monitoring tasks.
Audio Event Recognition in the Smart Home
Three aspects of the productization of AER are reviewed, including the key notions underpinning European data and privacy protection laws; the review suggests a set of guidelines that can be summarized as empowering users to consent by fully informing them about the use of their data.
Knowledge Transfer from Weakly Labeled Audio Using Convolutional Neural Network for Sound Events and Scenes
This work describes a convolutional neural network (CNN) based framework for sound event detection and classification using weakly labeled audio data and proposes methods to learn representations using this model which can be effectively used for solving the target task.