• Corpus ID: 15159296

Sound Event Detection in Multisource Environments Using Source Separation

@inproceedings{Heittola2011SoundED,
  title={Sound Event Detection in Multisource Environments Using Source Separation},
  author={Toni Heittola and Annamaria Mesaros and Tuomas Virtanen and Antti J. Eronen},
  year={2011}
}
This paper proposes a sound event detection system for natural multisource environments, using a sound source separation front-end. The recognizer aims at detecting sound events from various everyday contexts. The audio is preprocessed using non-negative matrix factorization and separated into four individual signals. Each sound event class is represented by a Hidden Markov Model trained using mel frequency cepstral coefficients extracted from the audio. Each separated signal is used… 
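
The separation front-end described in the abstract can be illustrated with a minimal sketch. It assumes a plain NMF with multiplicative updates for the Euclidean cost and a soft-mask reconstruction; the abstract does not specify the NMF variant, cost function, or reconstruction the authors used, so the function name `nmf_separate` and all of its parameters are hypothetical. The MFCC and HMM stages are omitted.

```python
import numpy as np

def nmf_separate(V, n_components=4, n_iter=200, eps=1e-9, seed=0):
    """Split a non-negative magnitude spectrogram V (freq x time) into
    n_components streams using basic NMF (illustrative assumption only,
    not the paper's exact algorithm)."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random non-negative initialisation of spectral bases W and gains H.
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates that reduce ||V - W @ H||^2
        # while keeping W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    model = W @ H + eps
    # Soft masks: each component's share of the model, applied to V,
    # so the streams add back up to (approximately) the input.
    return [V * (np.outer(W[:, k], H[k]) / model) for k in range(n_components)]

# Toy input standing in for a real magnitude spectrogram.
V = np.random.default_rng(1).random((16, 20)) + 0.1
streams = nmf_separate(V)  # four separated spectrograms
```

Each returned stream could then be passed to a feature extractor and classifier on its own, which is the role the four separated signals play in the system described above.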

Minimally supervised sound event detection using a neural network
TLDR
A sound event detection system that is trained using a minimally annotated data set of single sounds to identify and separate components of polyphonic sounds and is able to achieve reasonable accuracy of source separation and detection with minimal training set.
A Joint Separation-Classification Model for Sound Event Detection of Weakly Labelled Data
TLDR
A joint separation-classification model is proposed that is trained only on weakly labelled audio data, in which only the tags of a recording are known and the times of the events are unknown; it outperforms a deep neural network baseline of 0.29.
Robust Sound Event Detection Through Noise Estimation and Source Separation Using NMF
TLDR
The proposed method is based on supervised non-negative matrix factorization (NMF) for separating target events from noise and can produce accurate source separation results by reducing noise residue and signal distortion of the reconstructed event spectrogram.
Supervised model training for overlapping sound events based on unsupervised source separation
TLDR
Two iterative approaches based on the EM algorithm are proposed to select the stream most likely to contain the target sound, giving an increase of 8 percentage units in detection accuracy.
Enhanced local feature approach for overlapping sound event recognition
  • J. Dennis, T. H. Dat
  • Computer Science
    Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific
  • 2014
TLDR
A feature-based approach to the challenging task of recognising overlapping sound events from single-channel audio, which takes the output of the generalised Hough transform (GHT) and uses it as a feature for classification, demonstrating that such an approach can improve upon the previous knowledge-based scoring system.
Continuous robust sound event classification using time-frequency features and deep learning
TLDR
This paper proposes and evaluates a novel Bayesian-inspired front end for the segmentation and detection of continuous sound recordings prior to classification, and benchmarks several high performing isolated sound classifiers to operate with continuous sound data by incorporating an energy-based event detection front end.
Sound event detection in real-life audio using joint spectral and temporal features
TLDR
A new approach for SED in real-life audio using Nonnegative Matrix Factor 2-D Deconvolution and RUSBoost techniques to capture the two-dimensional joint spectral and temporal information from the time-frequency representation while possibly separating the sound mixture into several sources.
Early Detection of Continuous and Partial Audio Events Using CNN
TLDR
A proven CNN classifier acting on spectrogram image features is combined with time-frequency shaped energy detection, which identifies seed regions within the spectrogram that are characteristic of auditory energy events, to allow early detection of events as they are developing.
Context-dependent sound event detection
TLDR
The two-step approach was found to improve the results substantially compared to the context-independent baseline system, and the detection accuracy can be almost doubled by using the proposed context-dependent event detection.

References

SHOWING 1-10 OF 18 REFERENCES
Events Detection for an Audio-Based Surveillance System
TLDR
The automatic shot detection system presented is based on a novelty detection approach which offers a solution to detect abnormality (abnormal audio events) in continuous audio recordings of public places and takes advantage of potential similarity between the acoustic signatures of the different types of weapons by building a hierarchical classification system.
Acoustic event detection in real life recordings
TLDR
A system for acoustic event detection in recordings from real-life environments is presented, using a network of hidden Markov models; it is capable of recognizing almost one third of the events, and the temporal positioning of the events is not correct for 84% of the time.
Musical Instrument Recognition in Polyphonic Audio Using Source-Filter Model for Sound Separation
This paper proposes a novel approach to musical instrument recognition in polyphonic audio signals by using a source-filter model and an augmented non-negative matrix factorization algorithm for
Drum transcription with non-negative spectrogram factorisation
TLDR
A novel method based on separating the target drum sounds from the input signal using non-negative matrix factorisation, and on detecting sound onsets from the separated signals, gave better results than two state-of-the-art methods in simulations with acoustic signals containing polyphonic drum sequences.
Environmental Sound Recognition With Time–Frequency Audio Features
TLDR
An empirical feature analysis for audio environment characterization is performed, and a matching pursuit algorithm is proposed for obtaining effective time-frequency features that yield higher recognition accuracy for environmental sounds.
Recognizing speech from simultaneous speakers
TLDR
This paper presents and evaluates factored methods for recognition of simultaneous speech from multiple speakers in single-channel recordings, using an NMF-based speaker separation algorithm that generates separated spectra for each speaker and a mask estimation method that generates spectral masks for each speaker.
Singer Identification in Polyphonic Music Using Vocal Separation and Pattern Recognition Methods
TLDR
It was found that vocal line separation enables robust singer identification down to 0dB and -5dB singer-to-accompaniment ratios.
Automatic surveillance of the acoustic activity in our living environment
TLDR
An acoustic surveillance system comprised of a computer and microphone situated in a typical office environment that continuously analyzes the acoustic activity at the recording site, separates all interesting events, and stores them in a database is reported.
Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria
  • T. Virtanen
  • Computer Science
    IEEE Transactions on Audio, Speech, and Language Processing
  • 2007
TLDR
An unsupervised learning algorithm for the separation of sound sources in one-channel music signals is presented and enables a better separation quality than the previous algorithms.
Audio-based context recognition
TLDR
This paper investigates the feasibility of an audio-based context recognition system, which is developed and compared to the accuracy of human listeners in the same task, with particular emphasis on the computational complexity of the methods.