Corpus ID: 219260243

A Dataset of Reverberant Spatial Sound Scenes with Moving Sources for Sound Event Localization and Detection

@inproceedings{Politis2020ADO,
  title={A Dataset of Reverberant Spatial Sound Scenes with Moving Sources for Sound Event Localization and Detection},
  author={Archontis Politis and Sharath Adavanne and Tuomas Virtanen},
  booktitle={DCASE},
  year={2020}
}
This report presents the dataset and the evaluation setup of the Sound Event Localization & Detection (SELD) task for the DCASE 2020 Challenge. The SELD task refers to the problem of trying to simultaneously classify a known set of sound event classes, detect their temporal activations, and estimate their spatial directions or locations while they are active. To train and test SELD systems, datasets of diverse sound events occurring under realistic acoustic conditions are needed. Compared to… 

Sound Event Detection and Localization Using CRNN Models (Technical Report)
Sound Event Localization and Detection (SELD) requires both spatial and temporal information of sound events that appears in an acoustic event. The sound event localization and detection DCASE2020
A Combination of Various Neural Networks for Sound Event Localization and Detection (Technical Report)
This technical report describes our approach to the DCASE 2021 task 3: Sound Event Localization and Detection (SELD). We propose a network architecture, a combination of various network layers, which
A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection
Proposes a four-stage data augmentation approach to ResNet-Conformer based acoustic modeling for sound event localization and detection (SELD). The ResNet-Conformer architecture models both global and local context dependencies of an audio sequence and yields further gains over the architectures used in the DCASE 2020 SELD evaluations.
ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection
In experimental evaluations with the DCASE 2020 Task 3 dataset, the ACCDOA representation outperformed the two-branch representation in SELD metrics with a smaller network size and performed better than state-of-the-art SELD systems in terms of localization and location-dependent detection.
SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection
Proposes a novel feature called Spatial Cue-Augmented Log-Spectrogram (SALSA), with exact time-frequency mapping between the signal power and the source directional cues, which is crucial for resolving overlapping sound sources.
Localization and Detection for Moving Sound Sources Using Consecutive Ensemble of 2D-CRNN (Technical Report)
This technical report introduces a deep learning strategy for sound event localization and detection in DCASE 2020 Task 3. This strategy is designed to get accurate estimation of both detecting and
A General Network Architecture for Sound Event Localization and Detection Using Transfer Learning and Recurrent Neural Network
The experimental results using the DCASE 2020 SELD dataset show that the performances of the proposed network architecture using different SED and DOA estimation algorithms and different audio formats are competitive with other state-of-the-art SELD algorithms.
What Makes Sound Event Localization and Detection Difficult? Insights from Error Analysis
Experimental results indicate polyphony as the main challenge in SELD, due to the difficulty in detecting all sound events of interest, and the SELD systems tend to make fewer errors for the polyphonic scenario that is dominant in the training set.
A Dataset of Dynamic Reverberant Sound Scenes with Directional Interferers for Sound Event Localization and Detection
To investigate the individual and combined effects of ambient noise, interferers, and reverberation, the baseline is evaluated on versions of the dataset that exclude or include combinations of these factors. The results indicate that directional interferers are by far the most detrimental.
A Model Ensemble Approach for Sound Event Localization and Detection
A more robust prediction of SED and DOA is obtained by model ensembling and post-processing; the system ranked first in the DCASE 2020 Task 3 challenge.

References

Showing 1-10 of 28 references
Polyphonic Sound Event Detection and Localization using a Two-Stage Strategy
Experimental results show that the proposed two-stage polyphonic sound event detection and localization method is able to improve the performance of both SED and DOAE, and also performs significantly better than the baseline method.
Classification of Spatial Audio Location and Content Using Convolutional Neural Networks
Sound Event Detection in the DCASE 2017 Challenge
Analysis of the systems' behavior reveals that task-specific optimization has a big role in producing good performance; however, often this optimization closely follows the ranking metric, and its maximization/minimization does not result in universally good performance.
A multi-room reverberant dataset for sound event localization and detection
This paper presents the sound event localization and detection (SELD) task setup for the DCASE 2019 challenge to detect the temporal activities of a known set of sound event classes, and further localize them in space when active.
Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks
The proposed convolutional recurrent neural network for joint sound event localization and detection (SELD) of multiple overlapping sound events in three-dimensional (3-D) space is generic and applicable to any array structures, robust to unseen DOA values, reverberation, and low SNR scenarios.
A Hybrid Parametric-Deep Learning Approach for Sound Event Localization and Detection
The proposed methodology relies on parametric spatial audio analysis for source localization and detection, combined with a deep learning-based monophonic event classifier, to reduce the localization error on the evaluation dataset.
CRNN-Based Multiple DoA Estimation Using Acoustic Intensity Features for Ambisonics Recordings
This work proposes to use a neural network built from stacked convolutional and recurrent layers in order to estimate the directions of arrival of multiple sources from a first-order Ambisonics recording, using features derived from the acoustic intensity vector as inputs.
First Order Ambisonics Domain Spatial Augmentation for DNN-based Direction of Arrival Estimation
Proposes a novel data augmentation method for training neural networks for Direction of Arrival (DOA) estimation by expanding the representation of the DOA subspace of a dataset.
GCC-PHAT Cross-Correlation Audio Features for Simultaneous Sound Event Localization and Detection (SELD) on Multiple Rooms
In this work, we show a simultaneous sound event localization and detection (SELD) system, with enhanced acoustic features, in which we propose using the well-known Generalized Cross Correlation
Hierarchical Detection of Sound Events and their Localization Using Convolutional Neural Networks with Adaptive Thresholds
The system is based on multi-channel convolutional neural networks, combined with data augmentation and ensembling, and follows a hierarchical approach that first determines adaptive thresholds for the multi-label sound event detection (SED) problem, based on a CNN operating on spectrograms over long-duration windows.