Joint Measurement of Localization and Detection of Sound Events

@article{Mesaros2019JointMO,
  title={Joint Measurement of Localization and Detection of Sound Events},
  author={Annamaria Mesaros and Sharath Adavanne and Archontis Politis and Toni Heittola and Tuomas Virtanen},
  journal={2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
  year={2019},
  pages={333--337}
}
Sound event detection and sound localization or tracking have historically been two separate areas of research. Recent sound event detection methods also address localization, but lack a consistent way of measuring the joint performance of the system; instead, they measure the detection and localization abilities separately. This paper proposes augmentation of the localization metrics with a condition related to the detection, and conversely, use of location…

Overview and Evaluation of Sound Event Localization and Detection in DCASE 2019
TLDR
An overview of the first international evaluation on sound event localization and detection, organized as a task of the DCASE 2019 Challenge, presents in detail how the systems were evaluated and ranked and the characteristics of the best-performing systems.
Ensemble of Sequence Matching Networks for Dynamic Sound Event Localization, Detection, and Tracking
TLDR
In order to estimate directions-of-arrival of moving sound sources with higher required spatial resolutions than those of static sources, this work proposes to separate the directional estimates into azimuth and elevation estimates before passing them to the sequence matching network.
What Makes Sound Event Localization and Detection Difficult? Insights from Error Analysis
TLDR
Experimental results indicate polyphony as the main challenge in SELD, due to the difficulty in detecting all sound events of interest, and the SELD systems tend to make fewer errors for the polyphonic scenario that is dominant in the training set.
Sound event localization and detection based on crnn using rectangular filters and channel rotation data augmentation
TLDR
The proposed system is a convolutional recurrent neural network using rectangular filters specialized in recognizing significant spectral features related to the task, considerably improving Error Rate and F-score for location-aware detection.
SoundDet: Polyphonic Moving Sound Event Detection and Localization from Raw Waveform
TLDR
A new framework SoundDet is presented, which is an end-to-end trainable and light-weight framework, for polyphonic moving sound event detection and localization, which consists of a backbone neural network and two parallel heads for temporal detection and spatial localization.
SOUND EVENT LOCALIZATION AND DETECTION BASED ON CRNN USING DENSE RECTANGULAR FILTERS AND CHANNEL ROTATION DATA AUGMENTATION Technical Report
TLDR
Evaluation results on the cross-validation development dataset show that the proposed system outperforms the baseline results, considerably improving Error Rate and F-score for location-aware detection.
SoundDet: Polyphonic Sound Event Detection and Localization from Raw Waveform
TLDR
A new framework SoundDet is presented, which is an end-to-end trainable and light-weight framework, for polyphonic moving sound event detection and localization, which consists of a backbone neural network and two parallel heads for temporal detection and spatial localization.
Papafil: A Low Complexity Sound Event Localization and Detection Method with Parametric Particle Filtering and Gradient Boosting
The present technical report describes the architecture of the system submitted to the DCASE 2020 Challenge Task 3: Sound Event Localization and Detection. The proposed method conforms a low
DCASE 2020 TASK 3: ENSEMBLE OF SEQUENCE MATCHING NETWORKS FOR DYNAMIC SOUND EVENT LOCALIZATION, DETECTION, AND TRACKING Technical Report
TLDR
In order to estimate directions-of-arrival of moving sound sources with high spatial resolution, it is proposed to separate the directional estimations into azimuth and elevation before passing them to the sequence matching network.
Sound Event Localization and Detection Based on Adaptive Hybrid Convolution and Multi-scale Feature Extractor
TLDR
A method based on Adaptive Hybrid Convolution (AHConv) and a multi-scale feature extractor is proposed to capture the dependencies along the time dimension and the frequency dimension respectively, together with an adaptive attention block that can integrate information from very local to exponentially enlarged receptive fields within the block.

References

SHOWING 1-10 OF 19 REFERENCES
Localization, Detection and Tracking of Multiple Moving Sound Sources with a Convolutional Recurrent Neural Network
TLDR
The results show that the CRNN manages to track multiple sources more consistently than the parametric method across acoustic scenarios, but at the cost of higher localization error.
Detection, classification and localization of acoustic events in the presence of background noise for acoustic surveillance of hazardous situations
TLDR
It is found that the engineered algorithms provide a sufficient robustness in moderately intense noise in order to be applied to practical audio-visual surveillance systems.
Sound based localization and identification in industrial environments
TLDR
A passive sound localization and classification system is designed and implemented that can detect the acoustic signature of power tools, and its effectiveness as an early warning system to detect misuse of machinery is demonstrated.
Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks
TLDR
The proposed convolutional recurrent neural network for joint sound event localization and detection (SELD) of multiple overlapping sound events in three-dimensional (3-D) space is generic and applicable to any array structures, robust to unseen DOA values, reverberation, and low SNR scenarios.
Metrics for Polyphonic Sound Event Detection
This paper presents and discusses various metrics proposed for evaluation of polyphonic sound event detection systems used in realistic situations where there are typically multiple sound sources
Two-source acoustic event detection and localization: Online implementation in a Smart-room
TLDR
This work implemented online 2-source acoustic event detection and localization algorithms in a Smart-room, a closed space equipped with multiple microphones, showing high recognition accuracy for most acoustic events, both isolated and overlapped with speech.
Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics
TLDR
This work introduces two intuitive and general metrics to allow for objective comparison of tracker characteristics, focusing on their precision in estimating object locations, their accuracy in recognizing object configurations and their ability to consistently label objects over time.
Sound-model-based acoustic source localization using distributed microphone arrays
TLDR
A new source localization technique is proposed that works jointly with an acoustic event detection system and it seems that the proposed model-based approach can be an alternative to current techniques for event-based localization.
A multi-room reverberant dataset for sound event localization and detection
TLDR
This paper presents the sound event localization and detection (SELD) task setup for the DCASE 2019 challenge to detect the temporal activities of a known set of sound event classes, and further localize them in space when active.
A Consistent Metric for Performance Evaluation of Multi-Object Filters
TLDR
This paper outlines the inconsistencies of existing metrics in the context of multi-object miss-distances for performance evaluation, and proposes a new mathematically and intuitively consistent metric that addresses the drawbacks of current multi-object performance evaluation metrics.