• Publications
  • Influence
Sound Event Detection in Domestic Environments with Weakly Labeled Data and Soundscape Synthesis
TLDR
The paper introduces Domestic Environment Sound Event Detection (DESED) dataset mixing a part of last year dataset and an additional synthetic, strongly labeled, dataset provided this year that’s described more in detail.
Large-Scale Weakly Labeled Semi-Supervised Sound Event Detection in Domestic Environments
TLDR
This paper presents DCASE 2018 task 4.0, which evaluates systems for the large-scale detection of sound events using weakly labeled data (without time boundaries) and explores the possibility to exploit a large amount of unbalanced and unlabeled training data together with a small weakly labeling training set to improve system performance.
Sound Event Detection in Synthetic Domestic Environments
TLDR
A comparative analysis of the performance of state-of-the-art sound event detection systems based on the results of task 4 of the DCASE 2019 challenge, where submitted systems were evaluated on a series of synthetic soundscapes that allow us to carefully control for different soundscape characteristics.
Training Sound Event Detection on a Heterogeneous Dataset
TLDR
This work proposes to perform a detailed analysis of DCASE 2020 task 4 sound event detection baseline with regards to several aspects such as the type of data used for training, the parameters of the mean-teacher or the transformations applied while generating the synthetic soundscapes.
Improving Sound Event Detection in Domestic Environments using Sound Separation
TLDR
This paper starts from a sound separation model trained on the Free Universal Sound Separation dataset and the DCASE 2020 task 4 sound event detection baseline, and explores different methods to combine separated sound sources and the original mixture within the sound event Detection.
What’s all the Fuss about Free Universal Sound Separation Data?
TLDR
An open-source baseline separation model that can separate a variable number of sources in a mixture is introduced, based on an improved time-domain convolutional network (TDCN++), that achieves scale-invariant signal-to-noise ratio improvement (SI-SNRi) on mixtures with two to four sources.
Improving Sound Event Detection Metrics: Insights from DCASE 2020
TLDR
This paper compares conventional event-based and segment-based criteria against the Polyphonic Sound Detection Score (PSDS)'s intersection-based criterion, over a selection of systems from DCASE 2020 Challenge Task 4.
Sound Event Detection from Partially Annotated Data: Trends and Challenges
TLDR
A detailed analysis of the impact of the time segmentation, the event classification and the methods used to exploit unlabeled data on the final performance of sound event detection systems is proposed.
Semi-supervised Triplet Loss Based Learning of Ambient Audio Embeddings
TLDR
This paper combines unsupervised and supervised triplet loss based learning into a semi-supervised representation learning approach, whereby the positive samples for those triplets whose anchors are unlabeled are obtained either by applying a transformation to the anchor, or by selecting the nearest sample in the training set.
Sound Event Detection and Separation: A Benchmark on Desed Synthetic Soundscapes
TLDR
It is shown that temporal localization of sound events remains a challenge for SED systems and that reverberation and non-target sound events severely degrade system performance.
...
...