The impact of non-target events in synthetic soundscapes for sound event detection

@inproceedings{Ronchini2021TheIO,
  title={The impact of non-target events in synthetic soundscapes for sound event detection},
  author={Francesca Ronchini and Romain Serizel and Nicolas Turpault and Samuele Cornell},
  booktitle={Workshop on Detection and Classification of Acoustic Scenes and Events},
  year={2021}
}
The Detection and Classification of Acoustic Scenes and Events Challenge 2021 Task 4 uses a heterogeneous dataset that includes both recorded and synthetic soundscapes. Until recently, only target sound events were considered when synthesizing the soundscapes. However, recorded soundscapes often contain a substantial number of non-target events that may affect performance. In this paper, we focus on the impact of these non-target events in the synthetic soundscapes. Firstly, we investigate to what…

A Benchmark of State-of-the-Art Sound Event Detection Systems Evaluated on Synthetic Soundscapes

  • Francesca Ronchini, R. Serizel
  • Computer Science
    ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2022
A benchmark of submissions to the Detection and Classification of Acoustic Scenes and Events (DCASE) 2021 Challenge Task 4, representing a sample of the state of the art in sound event detection, is proposed. Results show that systems adapted to provide coarse segmentation outputs are more robust to different target to non-target signal-to-noise ratios and to the time localization of the original event.

Description and analysis of novelties introduced in DCASE Task 4 2022 on the baseline system

Three main novelties were introduced: the use of external datasets, including recently released strongly annotated clips from AudioSet; the possibility of leveraging pre-trained models; and a new energy consumption metric to raise awareness of the ecological impact of training sound event detectors.

Threshold Independent Evaluation of Sound Event Detection Scores

A method which allows for computing system performance on an evaluation set for all possible thresholds jointly, enabling accurate computation not only of the PSD-ROC and PSDS but also of other collar-based and intersection-based performance curves.

An Effective Consistency Regularization Training Based Mean Teacher Method for Sound Event Detection Technical Report

This technical report describes the system submitted to DCASE 2021 Task 4: Sound Event Detection in Domestic Environments, and proposes adding an auxiliary branch to the CRNN network to improve the model's detection and classification ability.

References

Sound Event Detection in Domestic Environments with Weakly Labeled Data and Soundscape Synthesis

The paper introduces the Domestic Environment Sound Event Detection (DESED) dataset, which combines part of last year's dataset with an additional synthetic, strongly labeled dataset provided this year and described in more detail.

Sound Event Detection in Synthetic Domestic Environments

A comparative analysis of the performance of state-of-the-art sound event detection systems based on the results of task 4 of the DCASE 2019 challenge, where submitted systems were evaluated on a series of synthetic soundscapes that allow us to carefully control for different soundscape characteristics.

Sound Event Detection from Partially Annotated Data: Trends and Challenges

A detailed analysis of the impact of the time segmentation, the event classification and the methods used to exploit unlabeled data on the final performance of sound event detection systems is proposed.

A Framework for the Robust Evaluation of Sound Event Detection

A new framework for performance evaluation of polyphonic sound event detection (SED) systems is defined, which overcomes the limitations of the conventional collar-based event decisions, event F-scores and event error rates and introduces a definition of event detection that is more robust against labelling subjectivity.

Large-Scale Weakly Labeled Semi-Supervised Sound Event Detection in Domestic Environments

This paper presents DCASE 2018 task 4, which evaluates systems for the large-scale detection of sound events using weakly labeled data (without time boundaries) and explores the possibility of exploiting a large amount of unbalanced and unlabeled training data together with a small weakly labeled training set to improve system performance.

TUT database for acoustic scene classification and sound event detection

The recording and annotation procedure, the database content, a recommended cross-validation setup, and the performance of a supervised acoustic scene classification system and an event detection baseline system using mel frequency cepstral coefficients and Gaussian mixture models are presented.

Scaper: A library for soundscape synthesis and augmentation

Given a collection of isolated sound events, Scaper acts as a high-level sequencer that can generate multiple soundscapes from a single, probabilistically defined “specification”, to increase the variability of the output.

A Closer Look at Weak Label Learning for Audio Events

This work describes a CNN-based approach for weakly supervised training of audio events, describes important characteristics that naturally arise in weakly supervised learning of sound events, and shows how these aspects of weak labels affect the generalization of models.

The Benefit of Temporally-Strong Labels in Audio Event Classification

  • Shawn Hershey, D. Ellis, M. Plakal
  • Computer Science
    ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2021
It is shown that fine-tuning with a mix of weak- and strongly-labeled data can substantially improve classifier performance, even when evaluated using only the original weak labels.

Computational Analysis of Sound Scenes and Events

This book presents computational methods for extracting useful information from audio signals, collecting the state of the art in the field of sound event and scene analysis, and gives an overview of methods for computational analysis of sound scenes and events.