Corpus ID: 35531751

A report on sound event detection with different binaural features

@article{Adavanne2017ARO,
  title={A report on sound event detection with different binaural features},
  author={Sharath Adavanne and Tuomas Virtanen},
  journal={ArXiv},
  year={2017},
  volume={abs/1710.02997}
}
In this paper, we compare the performance of using binaural audio features in place of single-channel features for sound event detection. Three different binaural features are studied and evaluated on the publicly available TUT Sound Events 2017 dataset of length 70 minutes. Sound event detection is performed separately with single-channel and binaural features using a stacked convolutional and recurrent neural network, and the evaluation is reported using the standard metrics of error rate and F-score.
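
As a rough illustration of the kind of binaural front end described in the abstract, the sketch below extracts log mel-band energies separately from the left and right channels of a stereo recording and stacks them along a channel axis. This is only a plausible reconstruction: the function name, the use of librosa, and all parameter values (sampling rate, 40 mel bands, frame and hop sizes) are assumptions for illustration, not the paper's exact feature configuration.

    # Hypothetical sketch: per-channel log mel-band energies for a stereo (binaural) recording.
    # Parameter values are illustrative assumptions, not the paper's exact setup.
    import numpy as np
    import librosa

    def binaural_log_mel(path, sr=44100, n_mels=40, n_fft=2048, hop_length=1024):
        # mono=False keeps both channels; for a stereo file y has shape (2, n_samples)
        y, sr = librosa.load(path, sr=sr, mono=False)
        feats = []
        for ch in y:  # iterate over the left and right channels
            mel = librosa.feature.melspectrogram(
                y=ch, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)
            feats.append(np.log(mel + 1e-10).T)  # (frames, n_mels) per channel
        # Stack the channels along a third axis: (frames, n_mels, 2)
        return np.stack(feats, axis=-1)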

Citations

Multichannel Sound Event Detection Using 3D Convolutional Neural Networks for Learning Inter-channel Features
TLDR
The proposed method learns to recognize overlapping sound events from multichannel features and performs better SED with fewer training epochs.
Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)
TLDR
The proposed SED system is compared against the state-of-the-art mono-channel method on the development subset of the TUT Sound Events 2016 database, and the use of spatial and harmonic features is shown to improve SED performance.
A REPORT ON SOUND EVENT LOCALIZATION AND DETECTION Technical Report
TLDR
The studied neural network with noise on the input data is seen to consistently perform on par with the original baseline with respect to the error rate metric.
A survey of deep learning for polyphonic sound event detection
TLDR
A review of the SED problem is presented and different deep learning approaches to it are discussed, as seen in the Detection and Classification of Acoustic Scenes and Events (DCASE) challenges 2016–2017.
Convolutional Neural Networks with Multi-task Loss for Polyphonic Sound Event Detection
TLDR
A multi-task loss function is coupled with different neural networks and applied to a polyphonic sound event detection task, and the approach is compared with DNN, CNN, and CBRNN methods.
Sound Event Detection Using Multiple Optimized Kernels
TLDR
Experimental results on different subsets of AudioSet demonstrate the performance of the proposed approach compared to state-of-the-art systems.
Sound Event Detection in the DCASE 2017 Challenge
TLDR
Analysis of the systems' behavior reveals that task-specific optimization plays a big role in producing good performance; however, this optimization often closely follows the ranking metric, and its maximization/minimization does not result in universally good performance.
Audio Sound Determination Using Feature Space Attention Based Convolution Recurrent Neural Network
TLDR
Results on the latest TUT Sound Events 2017 dataset demonstrate the improved performance of the proposed feature-space-attention-based convolutional recurrent neural network, which exploits the varying importance of each feature dimension to perform acoustic event detection.
Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks
TLDR
The proposed convolutional recurrent neural network for joint sound event localization and detection (SELD) of multiple overlapping sound events in three-dimensional (3-D) space is generic, applicable to any array structure, and robust to unseen DOA values, reverberation, and low-SNR scenarios.
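
To make the joint detection-and-localization idea concrete, the following is a hedged sketch of a shared recurrent trunk with two frame-wise output branches: a sigmoid branch for event activity and a regression branch for direction of arrival (DOA). It is not the exact SELDnet configuration; the layer sizes, the tanh DOA activation, and the per-class x/y/z encoding are illustrative assumptions.

    # Hedged sketch of a joint SED + DOA output arrangement (assumed Keras; not the exact SELDnet setup).
    import tensorflow as tf

    frames, feat_dim, n_classes = 256, 128, 6
    # Assume convolutional feature extraction has already produced frame-wise features.
    inp = tf.keras.layers.Input(shape=(frames, feat_dim))
    x = tf.keras.layers.Bidirectional(tf.keras.layers.GRU(64, return_sequences=True))(inp)
    # SED branch: per-class activity probabilities for every frame.
    sed = tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(n_classes, activation='sigmoid'), name='sed')(x)
    # DOA branch: per-class direction estimates, here encoded as x/y/z coordinates.
    doa = tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(3 * n_classes, activation='tanh'), name='doa')(x)
    model = tf.keras.Model(inp, [sed, doa])
    model.compile(optimizer='adam',
                  loss={'sed': 'binary_crossentropy', 'doa': 'mse'},
                  loss_weights={'sed': 1.0, 'doa': 1.0})
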
THREE-STAGE APPROACH FOR SOUND EVENT LOCALIZATION AND DETECTION Technical Report
TLDR
This paper describes a three-stage system for the sound event localization and detection (SELD) task, which employs multi-resolution cochleagram features from 4-channel audio and a convolutional recurrent neural network (CRNN) model to detect sound activity.

References

SHOWING 1-10 OF 24 REFERENCES
Sound event detection using spatial features and convolutional recurrent neural network
TLDR
This paper proposes to use low-level spatial features extracted from multichannel audio for sound event detection and shows that the network learns sound events in multichannel audio better when the features of each channel are presented as separate layers of a volume rather than concatenated into a single feature vector.
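
A small example of the two input arrangements compared there, assuming per-channel features of shape (frames, n_mels): concatenation yields one long feature vector per frame, while stacking keeps each channel as a separate layer of an image-like volume. All names and sizes below are hypothetical.

    # Illustration of "concatenated feature vector" vs "separate layers of a volume".
    import numpy as np

    frames, n_mels = 500, 40
    left = np.random.randn(frames, n_mels)    # stand-in for left-channel features
    right = np.random.randn(frames, n_mels)   # stand-in for right-channel features

    # (a) Concatenated into a single feature vector per frame: (frames, 2 * n_mels)
    concatenated = np.concatenate([left, right], axis=1)

    # (b) Presented as separate layers of a volume, one depth layer per channel: (frames, n_mels, 2)
    volume = np.stack([left, right], axis=-1)

    print(concatenated.shape, volume.shape)  # (500, 80) (500, 40, 2)
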
Sound Event Detection in Multichannel Audio Using Spatial and Harmonic Features
TLDR
The proposed SED system is compared against the state-of-the-art mono-channel method on the development subset of the TUT Sound Events 2016 database, and the use of spatial and harmonic features is shown to improve SED performance.
Polyphonic sound event detection using multi label deep neural networks
TLDR
Frame-wise spectral-domain features are used as inputs to train a deep neural network for multi-label classification in this work, and the proposed method improves the accuracy by 19 percentage points overall.
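
For readers unfamiliar with the multi-label setup, a minimal sketch follows (written with Keras as an assumed toolkit, not necessarily the one used in the cited work): one sigmoid output per class and a binary cross-entropy loss, so several events can be active in the same frame. Layer sizes are illustrative.

    # Minimal multi-label frame classifier sketch; architecture details are assumptions.
    import tensorflow as tf

    n_features, n_classes = 80, 6
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features,)),
        tf.keras.layers.Dense(256, activation='relu'),
        tf.keras.layers.Dense(256, activation='relu'),
        tf.keras.layers.Dense(n_classes, activation='sigmoid'),  # one independent sigmoid per class
    ])
    # Binary cross-entropy treats each class as an independent detection decision,
    # which is what allows overlapping (polyphonic) events to be predicted simultaneously.
    model.compile(optimizer='adam', loss='binary_crossentropy')
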
Robust sound event recognition using convolutional neural networks
TLDR
This work proposes novel features derived from spectrogram energy triggering, allied with the powerful classification capabilities of a convolutional neural network (CNN), and demonstrates excellent performance under noise-corrupted conditions when compared against state-of-the-art approaches on standard evaluation tasks.
Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection
TLDR
This work combines these two approaches in a convolutional recurrent neural network (CRNN) and applies it to a polyphonic sound event detection task, observing a considerable improvement on four different datasets of everyday sound events.
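
A minimal stacked CRNN of the kind referred to throughout this page might look as follows; the layer counts, filter sizes, and pooling choices are illustrative assumptions rather than the configuration used in this or the cited paper.

    # Sketch of a stacked convolutional-recurrent network for frame-wise multi-label SED (assumed Keras).
    import tensorflow as tf

    frames, n_mels, n_channels, n_classes = 256, 40, 2, 6
    inp = tf.keras.layers.Input(shape=(frames, n_mels, n_channels))
    x = inp
    for _ in range(3):
        x = tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu')(x)
        x = tf.keras.layers.MaxPooling2D(pool_size=(1, 2))(x)  # pool frequency only, keep time resolution
    # Collapse the remaining frequency and filter axes so each frame has a single feature vector.
    x = tf.keras.layers.Reshape((frames, -1))(x)
    x = tf.keras.layers.Bidirectional(tf.keras.layers.GRU(64, return_sequences=True))(x)
    out = tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(n_classes, activation='sigmoid'))(x)
    model = tf.keras.Model(inp, out)
    model.compile(optimizer='adam', loss='binary_crossentropy')
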
Metrics for Polyphonic Sound Event Detection
This paper presents and discusses various metrics proposed for evaluation of polyphonic sound event detection systems used in realistic situations where there are typically multiple sound sources.
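
The segment-based error rate and F-score used in that evaluation framework can be sketched from binary activity matrices using the usual substitution/deletion/insertion decomposition; the helper name and the (n_segments, n_classes) input layout below are assumptions for illustration.

    # Sketch of segment-based ER and F-score for polyphonic SED.
    # `reference` and `estimated` are binary matrices of shape (n_segments, n_classes).
    import numpy as np

    def segment_based_metrics(reference, estimated):
        reference = reference.astype(bool)
        estimated = estimated.astype(bool)
        tp = np.sum(reference & estimated, axis=1)    # true positives per segment
        fp = np.sum(~reference & estimated, axis=1)   # false positives per segment
        fn = np.sum(reference & ~estimated, axis=1)   # false negatives per segment
        n_ref = np.sum(reference, axis=1)             # active reference events per segment

        substitutions = np.minimum(fp, fn)
        deletions = np.maximum(0, fn - fp)
        insertions = np.maximum(0, fp - fn)
        error_rate = (substitutions + deletions + insertions).sum() / max(n_ref.sum(), 1)

        precision = tp.sum() / max((tp + fp).sum(), 1)
        recall = tp.sum() / max((tp + fn).sum(), 1)
        f_score = 2 * precision * recall / max(precision + recall, 1e-12)
        return error_rate, f_score
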
Recurrent neural networks for polyphonic sound event detection in real life recordings
In this paper we present an approach to polyphonic sound event detection in real life recordings based on bi-directional long short term memory (BLSTM) recurrent neural networks (RNNs). A single…
Weakly-supervised audio event detection using event-specific Gaussian filters and fully convolutional networks
TLDR
A model based on convolutional neural networks is proposed that relies only on weakly-supervised data for training and is able to detect frame-level information, e.g., the temporal position of sounds, even though it is trained merely with clip-level labels.
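
The general weak-label mechanism, stripped of the event-specific Gaussian filters used in that paper, can be sketched as a fully convolutional network whose frame-level probabilities are pooled (here with a simple max, an assumption) into a clip-level prediction that can be trained against clip labels alone, while the frame-level output is read off at test time.

    # Hedged sketch of weakly-supervised SED: clip-level training, frame-level readout (assumed Keras).
    import tensorflow as tf

    frames, n_mels, n_classes = 256, 40, 6
    inp = tf.keras.layers.Input(shape=(frames, n_mels, 1))
    x = tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu')(inp)
    x = tf.keras.layers.MaxPooling2D(pool_size=(1, 2))(x)
    x = tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu')(x)
    x = tf.keras.layers.MaxPooling2D(pool_size=(1, 20))(x)  # collapse the frequency axis
    frame_probs = tf.keras.layers.Conv2D(n_classes, (1, 1), activation='sigmoid')(x)
    frame_probs = tf.keras.layers.Reshape((frames, n_classes))(frame_probs)  # frame-level predictions
    clip_probs = tf.keras.layers.GlobalMaxPooling1D()(frame_probs)           # clip-level prediction
    model = tf.keras.Model(inp, [frame_probs, clip_probs])
    # Only clip_probs needs labels during training; frame_probs provides temporal positions at test time.
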
Broadband doa estimation using convolutional neural networks trained with noise signals
TLDR
Through experimental evaluation, the ability of the proposed noise-trained CNN framework to generalize to speech sources is demonstrated, and the robustness of the system to noise and to small perturbations in microphone positions, as well as its ability to adapt to different acoustic conditions, is investigated.
Acoustic event detection in real life recordings
TLDR
A system for acoustic event detection in recordings from real-life environments using a network of hidden Markov models is presented; it is capable of recognizing almost one third of the events, although the temporal positioning of the events is incorrect 84% of the time.