• Corpus ID: 233864527

USM-SED - A Dataset for Polyphonic Sound Event Detection in Urban Sound Monitoring Scenarios

@article{Abeer2021USMSEDA,
  title={USM-SED - A Dataset for Polyphonic Sound Event Detection in Urban Sound Monitoring Scenarios},
  author={Jakob Abe{\ss}er},
  journal={ArXiv},
  year={2021},
  volume={abs/2105.02592}
}
  • J. Abeßer
  • Published 6 May 2021
  • Computer Science
  • ArXiv
This paper introduces a novel dataset for polyphonic sound event detection in urban sound monitoring use cases. Based on isolated sounds taken from the FSD50K dataset, 20,000 polyphonic soundscapes are synthesized, with sounds randomly positioned in the stereo panorama at different loudness levels. The paper discusses possible application scenarios, explains the dataset generation process in detail, and describes current limitations of the proposed USM-SED dataset.
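The synthesis idea summarized above (isolated clips mixed into a stereo scene at random positions and loudness levels) can be sketched roughly as follows. This is only an illustrative sketch, not the paper's actual generation procedure: the constant-power panning law, the level range in decibels, the random onsets, and all function and parameter names (`pan_gains`, `db_to_gain`, `synthesize_soundscape`, `level_range_db`) are assumptions made for the example.

```python
import math
import random


def pan_gains(position):
    """Constant-power stereo gains for a position in [-1 (left), +1 (right)].

    Assumption: the panning law used for the real dataset may differ.
    """
    angle = (position + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


def db_to_gain(level_db):
    """Convert a level in decibels to a linear amplitude factor."""
    return 10.0 ** (level_db / 20.0)


def synthesize_soundscape(sounds, num_samples, rng, level_range_db=(-20.0, 0.0)):
    """Mix mono clips into a stereo buffer at random onsets, pans, and levels.

    `sounds` maps a class label to a list of float samples (hypothetical input
    format). Returns the left channel, the right channel, and a list of
    (label, onset, offset, position) annotations in samples.
    """
    left = [0.0] * num_samples
    right = [0.0] * num_samples
    annotations = []
    for label, samples in sounds.items():
        # Random onset such that the clip fits when it is short enough.
        onset = rng.randint(0, max(0, num_samples - len(samples)))
        position = rng.uniform(-1.0, 1.0)
        gain = db_to_gain(rng.uniform(*level_range_db))
        gain_left, gain_right = pan_gains(position)
        for i, value in enumerate(samples):
            if onset + i >= num_samples:
                break  # truncate clips that run past the end of the scene
            left[onset + i] += gain * gain_left * value
            right[onset + i] += gain * gain_right * value
        offset = min(onset + len(samples), num_samples)
        annotations.append((label, onset, offset, position))
    return left, right, annotations
```

Calling `synthesize_soundscape({"siren": [1.0] * 100, "dog": [0.5] * 50}, 400, random.Random(0))` would return two 400-sample channels plus one annotation per clip. The annotations play the role of the strong labels that a polyphonic sound event detection model would be trained against.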

Polyphonic sound event detection for highly dense birdsong scenes

This study uses a Convolutional Recurrent Neural Network (CRNN) to show how polyphonic birdsong can be detected at higher levels of polyphony, and how effectively this type of model handles a very dense scene with up to 10 overlapping birds.

ARAUS: A Large-Scale Dataset and Baseline Models of Affective Responses to Augmented Urban Soundscapes

The ARAUS (Affective Responses to Augmented Urban Soundscapes) dataset, which comprises a cross-validation set and independent test set totaling 25,440 unique subjective perceptual responses to augmented soundscapes presented as audio-visual stimuli, is made publicly available.

FSD50K: An Open Dataset of Human-Labeled Sound Events

FSD50K is introduced, an open dataset containing over 51k audio clips totalling over 100 h of audio manually labeled using 200 classes drawn from the AudioSet Ontology, to provide an alternative benchmark dataset and thus foster SER research.

A Strongly-Labelled Polyphonic Dataset of Urban Sounds with Spatiotemporal Context

An accompanying hierarchical label taxonomy is introduced for SINGA:PURA, a strongly labelled polyphonic urban sound dataset with spatiotemporal context, designed to be compatible with other existing datasets for urban sound tagging while also being able to capture sound events unique to the Singaporean context.

References

FSD50K: An Open Dataset of Human-Labeled Sound Events

FSD50K is introduced, an open dataset containing over 51k audio clips totalling over 100 h of audio manually labeled using 200 classes drawn from the AudioSet Ontology, to provide an alternative benchmark dataset and thus foster SER research.

Informing Piano Multi-Pitch Estimation with Inferred Local Polyphony Based on Convolutional Neural Networks

A method for local polyphony estimation (LPE), which is based on convolutional neural networks trained in a supervised fashion to explicitly predict the degree of polyphony, is proposed and results suggest that using explicit LPE information can refine MPE predictions.

DESED-FL and URBAN-FL: Federated Learning Datasets for Sound Event Detection

The results indicate that FL is a promising approach for SED, but faces challenges with divergent data distributions inherent to distributed client edge devices.

Overview and Evaluation of Sound Event Localization and Detection in DCASE 2019

An overview of the first international evaluation on sound event localization and detection, organized as a task of the DCASE 2019 Challenge, presents in detail how the systems were evaluated and ranked and the characteristics of the best-performing systems.

Environmental sound segmentation utilizing Mask U-Net

An environmental sound segmentation method is proposed that combines segmentation using U-Net with CNN-based sound event detection and is applied to 75 classes of environmental sounds; it improved learning speed and sound source separation compared with the conventional method.

Urban Noise Monitoring in the Stadtlärm Project - A Field Report

The experiences gained during the field test of the Stadtlärm system for distributed noise measurement, conducted in summer and fall of 2018 in Jena, Germany, are summarized.

A Framework for the Robust Evaluation of Sound Event Detection

A new framework for performance evaluation of polyphonic sound event detection (SED) systems is defined, which overcomes the limitations of the conventional collar-based event decisions, event F-scores and event error rates and introduces a definition of event detection that is more robust against labelling subjectivity.

A Survey: Neural Network-Based Deep Learning for Acoustic Event Detection

How deep learning methods benefit the acoustic event detection task and the potential issues that need to be addressed for prospective real-world scenarios are discussed.

Computational Analysis of Sound Scenes and Events

This book presents computational methods for extracting useful information from audio signals, collecting the state of the art in the field of sound event and scene analysis, and gives an overview of methods for the computational analysis of sound scenes and events.

Musical Source Separation: An Introduction

This chapter discusses how to upmix a two-channel stereo recording to a 5.1-channel surround sound system, and how to change the spatial location of a musical instrument within the mix.