Recognizing Daily Life Context Using Web-Collected Audio Data

@inproceedings{Rossi2012RecognizingDL,
  title={Recognizing Daily Life Context Using Web-Collected Audio Data},
  author={Mirco Rossi and Gerhard Tr{\"o}ster and Oliver Amft},
  booktitle={2012 16th International Symposium on Wearable Computers},
  year={2012},
  pages={25--28}
}
This work presents an approach to model daily life contexts from web-collected audio data. Available in vast quantities from many different sources, audio data from the web provides heterogeneous training data for constructing recognition systems. Crowd-sourced textual descriptions (tags) attached to individual sound samples were used in a configurable recognition system to model 23 sound context categories. We analysed our approach using different outlier filtering techniques with dedicated…
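The pipeline the abstract describes (tag-labelled web audio samples used as training data for per-category sound models) can be sketched roughly as follows. This is a minimal illustration with synthetic frame-wise features standing in for real audio descriptors such as MFCCs, and a single diagonal Gaussian per category standing in for the paper's actual models; the category names and `fake_features` helper are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for web-collected, tag-labelled audio: each
# "sample" is a matrix of frame-wise feature vectors (MFCC-like).
def fake_features(center, n_frames=50, dim=13):
    return center + rng.normal(scale=0.5, size=(n_frames, dim))

CATEGORIES = ["street", "restaurant", "office"]  # 3 of the paper's 23
centers = {c: rng.normal(size=13) for c in CATEGORIES}

# One diagonal-Gaussian model per sound-context category, fit on
# features pooled from all web samples tagged with that category.
models = {}
for cat in CATEGORIES:
    X = np.vstack([fake_features(centers[cat]) for _ in range(5)])
    models[cat] = (X.mean(axis=0), X.var(axis=0) + 1e-6)

def log_likelihood(frames, model):
    mean, var = model
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (frames - mean) ** 2 / var)

def classify(frames):
    # Pick the category whose model best explains the frame sequence.
    return max(models, key=lambda c: log_likelihood(frames, models[c]))

print(classify(fake_features(centers["office"])))  # expected: office
```

The paper additionally filters outliers from the noisy web data before training; that step is omitted here.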


Combining crowd-generated media and personal data: semi-supervised learning for context recognition
TLDR
This work uses a semi-supervised Gaussian mixture model to combine labeled data from a crowd-generated database with unlabeled personal recordings, training a personalized context-recognition model for users' mobile phones.
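The semi-supervised idea in this blurb (initialise class models from labeled crowd data, then refine them with the user's unlabeled recordings) can be sketched with a small EM loop. The 1-D data, class means, and iteration count below are all hypothetical; real systems would use multivariate mixtures over audio features.

```python
import numpy as np

rng = np.random.default_rng(1)

# Labeled "crowd" data: two 1-D classes with known labels.
Xl = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(5.0, 1.0, 100)])
yl = np.array([0] * 100 + [1] * 100)

# Unlabeled "personal" data drawn from a shifted, user-specific distribution.
Xu = np.concatenate([rng.normal(0.5, 1.0, 200), rng.normal(5.5, 1.0, 200)])

# Initialise per-class Gaussians from the labeled crowd data ...
mu = np.array([Xl[yl == k].mean() for k in (0, 1)])
var = np.array([Xl[yl == k].var() for k in (0, 1)])

# ... then refine with EM: labeled points keep fixed (hard)
# responsibilities, unlabeled points get soft assignments.
for _ in range(20):
    # E-step on the unlabeled personal data
    ll = -0.5 * (np.log(2 * np.pi * var) + (Xu[:, None] - mu) ** 2 / var)
    r = np.exp(ll - ll.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step over labeled (hard) + unlabeled (soft) assignments
    X = np.concatenate([Xl, Xu])
    for k in (0, 1):
        w = np.concatenate([(yl == k).astype(float), r[:, k]])
        mu[k] = np.average(X, weights=w)
        var[k] = np.average((X - mu[k]) ** 2, weights=w) + 1e-6

print(np.round(mu, 2))  # means pulled toward the user's shifted data
```

The refined means land between the crowd means (0, 5) and the user's means (0.5, 5.5), which is the personalization effect the paper exploits.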
Audio-Based Activities of Daily Living (ADL) Recognition with Large-Scale Acoustic Embeddings from Online Videos
TLDR
A framework for audio-based activity recognition is proposed that can exploit millions of embedding features from public online video sound clips; it is based on a combination of oversampling and deep learning, and requires no further feature processing or outlier filtering.
Recognizing Detailed Human Context in the Wild from Smartphones and Smartwatches
TLDR
The authors demonstrate how the fusion of multimodal sensors is important for resolving situations that were harder to recognize, present a baseline system, and encourage researchers to use their public dataset to compare methods and improve context recognition in the wild.
Towards scalable activity recognition: adapting zero-effort crowdsourced acoustic models
TLDR
This work investigates two adaptation approaches: semi-supervised learning to combine crowd-sourced data with unlabeled user data, and active learning to query the user for labels on samples where the crowd-sourced model fails to recognize the context.
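The active-learning half of this summary (ask the user to label exactly the samples where the crowd-sourced model is unsure) is commonly realised as uncertainty sampling. A minimal sketch, assuming a toy 2-D feature space and a distance-based crowd model; all names and data here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Crowd-sourced model: one prototype mean per class (hypothetical).
means = np.array([[0.0, 0.0], [4.0, 4.0]])

def posteriors(x):
    # Soft class assignment from squared distance to each class mean.
    d = -0.5 * np.sum((means - x) ** 2, axis=1)
    p = np.exp(d - d.max())
    return p / p.sum()

# Unlabeled user samples; the middle block falls between the classes.
samples = np.vstack([
    rng.normal([0, 0], 0.2, (5, 2)),
    rng.normal([2, 2], 0.2, (5, 2)),   # ambiguous region
    rng.normal([4, 4], 0.2, (5, 2)),
])

# Uncertainty sampling: query the user for labels on the samples
# the crowd-sourced model is least confident about.
confidence = np.array([posteriors(x).max() for x in samples])
query_idx = np.argsort(confidence)[:3]
print(sorted(query_idx))  # indices from the ambiguous middle block
```

Only the ambiguous samples trigger a query, which keeps the labeling burden on the user low.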
Wearable sound-based recognition of daily life activities, locations and conversations
TLDR
A personal wearable sound-based recognition system is envisioned that provides continuous real-time context information about the user throughout the day, focusing on two categories of context: speaker sensing and ambient sound sensing.
Low-Power Ambient Sensing in Smartphones for Continuous Semantic Localization
TLDR
Low-power ambient sensors are proposed for integration into phones to enable continuous observation with minimal impact on power consumption, achieving up to 80% accuracy in recognizing five location categories in a user-specific setting while saving up to 85% of the battery power consumed by traditional sensing modalities.
4th workshop on human activity sensing corpus and applications: towards open-ended context awareness
TLDR
This workshop deals with the challenges of designing reproducible experimental setups, running large-scale dataset collection campaigns, designing robust activity and context recognition methods and evaluating systems in the real world.
5th Int. workshop on human activity sensing corpus and applications (HASCA): towards open-ended context awareness
TLDR
This workshop deals with the challenges of designing reproducible experimental setups, running large-scale dataset collection campaigns, designing activity and context recognition methods that are robust and adaptive, and evaluating systems in the real world.
Using unlabeled acoustic data with locality-constrained linear coding for energy-related activity recognition in buildings
TLDR
The proposed method applies locality-constrained linear coding to process the labeled and unlabeled samples, achieving acceptable classification accuracy compared with traditional supervised learning approaches that rely purely on large numbers of expensive annotations.
Inference of Conversation Partners by Cooperative Acoustic Sensing in Smartphone Networks
TLDR
This work considers the inference of conversation partners via acoustic sensing conducted by a group of co-located smartphones; by exploiting the continuity and overlap of speech, it proposes novel inference methods to identify conversational relationships among co-located users.

References

Audio-based context recognition
TLDR
This paper investigates the feasibility of audio-based context recognition: a system is developed and compared against the accuracy of human listeners on the same task, with particular emphasis on the computational complexity of the methods.
SoundSense: scalable sound sensing for people-centric applications on mobile phones
TLDR
This paper proposes SoundSense, a scalable framework for modeling sound events on mobile phones; it represents the first general-purpose sound sensing system specifically designed to work on resource-limited phones, and demonstrates that SoundSense can recognize meaningful sound events occurring in users' everyday lives.
Mining models of human activities from the web
TLDR
A new class of sensors, based on Radio Frequency Identification (RFID) tags, can directly yield semantic terms that describe the state of the physical world, and it is shown how to mine definitions of activities from the web in an unsupervised manner.
Large-scale content-based audio retrieval from text queries
TLDR
A machine learning approach for retrieving sounds that is novel in that it uses free-form text queries rather than sound-sample-based queries, searches by audio content rather than textual metadata, and can scale to a very large number of audio documents and a very rich query vocabulary.
Exploiting weakly-labeled Web images to improve object classification: a domain adaptation approach
TLDR
This paper investigates and compares methods that learn image classifiers by combining very few manually annotated examples and a large number of weakly-labeled Web photos retrieved using keyword-based image search, and finds that these classifiers are one order of magnitude faster to learn and to evaluate than the best competing method.
Audio Query by Example Using Similarity Measures between Probability Density Functions of Features
TLDR
A query by example system for generic audio is proposed that estimates the similarity of the example signal and the samples in the queried database by calculating the distance between the probability density functions of their frame-wise acoustic features.
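The similarity measure described here (distance between probability density functions of frame-wise features) can be illustrated by fitting a diagonal Gaussian to each clip's features and ranking database clips by symmetrised KL divergence to the query. This is a simplified sketch with synthetic features; the clip names and `fit` helper are hypothetical, and the cited paper considers more general density estimates.

```python
import numpy as np

# Fit a diagonal Gaussian to a clip's frame-wise features.
def fit(frames):
    return frames.mean(axis=0), frames.var(axis=0) + 1e-6

# Symmetrised KL divergence between two diagonal Gaussians.
def sym_kl(a, b):
    def kl(m1, v1, m2, v2):
        return 0.5 * np.sum(v1 / v2 + (m2 - m1) ** 2 / v2 - 1 + np.log(v2 / v1))
    (m1, v1), (m2, v2) = a, b
    return kl(m1, v1, m2, v2) + kl(m2, v2, m1, v1)

rng = np.random.default_rng(2)
query = fit(rng.normal(0.0, 1, (80, 12)))
database = {
    "near": fit(rng.normal(0.1, 1, (80, 12))),  # similar to the query
    "far":  fit(rng.normal(3.0, 1, (80, 12))),  # dissimilar
}

# Rank database clips by PDF distance to the query clip.
best = min(database, key=lambda k: sym_kl(query, database[k]))
print(best)  # expected: near
```

Ranking by PDF distance rather than raw frame distance makes the retrieval insensitive to clip length and frame ordering.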
SoundButton: design of a low power wearable audio classification system
TLDR
The paper deals with the design of a sound recognition system focused on an ultra-low-power hardware implementation in a button-like miniature form factor, and presents a VHDL model of the hardware showing that the method can be implemented with minimal resources.