• Corpus ID: 221094134

JOINT ACOUSTIC AND SUPERVISED INFERENCE FOR SOUND EVENT DETECTION Technical Report

@inproceedings{Park2020JOINTAA,
  title={JOINT ACOUSTIC AND SUPERVISED INFERENCE FOR SOUND EVENT DETECTION Technical Report},
  author={Sangwook Park and Ashwin Bellur and Sandeep Reddy Kothinti and Masoumeh Heidari Kapourchali and Mounya Elhilali},
  year={2020}
}
This is a technical report about a sound event detection system for Task 4 of DCASE 2020. The purpose of sound event detection is to find the class label of each event as well as its time boundaries. To achieve this, we considered several methods, such as signal enhancement and event boundary detection, and built five systems by integrating these methods with a supervised system trained using the Mean Teacher model. In particular, we estimate event boundaries of weakly labeled data by performing a…
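The Mean Teacher model mentioned in the abstract keeps a "teacher" copy of the network whose weights are an exponential moving average (EMA) of the "student" weights. The following is a minimal sketch of that weight update only, under stated assumptions: the function name `ema_update`, the flat list-of-floats weight representation, and the value of `alpha` are hypothetical choices for illustration and are not taken from the report.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights: alpha * teacher + (1 - alpha) * student.

    `teacher` and `student` are flat lists of floats standing in for model
    parameters (an assumption made for illustration).
    """
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


# Toy example: the teacher drifts toward fixed student weights.
teacher = [0.0, 0.0]
student = [1.0, -2.0]
for _ in range(3):
    teacher = ema_update(teacher, student, alpha=0.5)
print(teacher)
```

In actual training the student weights would change at every optimizer step, and a consistency loss between student and teacher predictions would be added; both are omitted here.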

References

Joint Acoustic and Class Inference for Weakly Supervised Sound Event Detection
TLDR
This work presents a hybrid approach that combines acoustic-driven event boundary detection with supervised label inference using a deep neural network. The approach leverages the benefits of both unsupervised and supervised methodologies and takes advantage of large amounts of unlabeled data, making it well suited to large-scale weakly labeled event detection.
Sound Event Detection in Domestic Environments with Weakly Labeled Data and Soundscape Synthesis
TLDR
The paper introduces the Domestic Environment Sound Event Detection (DESED) dataset, which mixes part of last year's dataset with an additional synthetic, strongly labeled dataset provided this year; the latter is described in more detail.
Score Fusion of Classification Systems for Acoustic Scene Classification
TLDR
This study explores several methods in three areas (feature extraction, generative/discriminative machine learning, and score fusion for the final decision) on the acoustic scene classification task of the IEEE AASP Challenge: Detection and Classification of Acoustic Scenes and Events.
Sound Event Detection in Synthetic Domestic Environments
TLDR
A comparative analysis of the performance of state-of-the-art sound event detection systems based on the results of task 4 of the DCASE 2019 challenge, where submitted systems were evaluated on a series of synthetic soundscapes that allow us to carefully control for different soundscape characteristics.
Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results
TLDR
The recently proposed Temporal Ensembling has achieved state-of-the-art results in several semi-supervised learning benchmarks, but it becomes unwieldy when learning large datasets, so Mean Teacher, a method that averages model weights instead of label predictions, is proposed.
Auto-Encoding Variational Bayes
TLDR
A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.
Bio-Mimetic Attentional Feedback in Music Source Separation
  • Ashwin Bellur, M. Elhilali
  • Psychology
    ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2020
TLDR
These competing theories of attentional feedback complement each other and yield state of the art performance in music source separation, and it is shown that systems with attentional mechanisms can be made to scale to mismatched conditions by retuning only the attentional modules with minimal data.
Auditory salience using natural soundscapes
TLDR
The study explores auditory salience in a set of dynamic natural scenes and indicates that contextual information about the entire scene over both short and long scales needs to be considered in order to properly account for perceptual judgments of salience.
Freesound technical demo
TLDR
This demo aims to introduce Freesound to the multimedia community and show its potential as a research resource.
Linear predictive coding
The basic principles of linear predictive coding (LPC) are presented. Least-squares methods for obtaining the LPC coefficients characterizing the all-pole filter are described. Computational factors,