Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)
@inproceedings{Plumbley2016ProceedingsOT,
  title={Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)},
  author={Mark D. Plumbley and Christian Kroos and Juan Pablo Bello and Ga{\"e}l Richard and Daniel P. W. Ellis and Annamaria Mesaros},
  year={2016}
}
In this paper, we propose the use of spatial and harmonic features in combination with a long short term memory (LSTM) recurrent neural network (RNN) for the automatic sound event detection (SED) task. Real life sound recordings typically have many overlapping sound events, making them hard to recognize with just mono channel audio. Human listeners successfully recognize mixtures of overlapping sound events using pitch cues and by exploiting the stereo (multichannel) audio signal available…
29 Citations
Acoustic Scene Classification Using Deep Audio Feature and BLSTM Network
- Computer Science
- 2018 International Conference on Audio, Language and Image Processing (ICALIP)
- 2018
This work presents an acoustic scene classification system for the Detection and Classification of Acoustic Scenes and Events 2017 (DCASE2017) challenge, using a Deep Audio Feature (DAF) for acoustic scene representation and a Bidirectional Long Short Term Memory (BLSTM) network as the classifier.
Sound Event Detection Using Multiple Optimized Kernels
- Computer Science
- IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2020
Experimental results on different subsets of AudioSet demonstrate the performance of the proposed approach compared to state-of-the-art systems.
GCC-PHAT Cross-Correlation Audio Features for Simultaneous Sound Event Localization and Detection (SELD) on Multiple Rooms
- Computer Science, Physics
- DCASE
- 2019
In this work, we show a simultaneous sound event localization and detection (SELD) system, with enhanced acoustic features, in which we propose using the well-known Generalized Cross Correlation…
An Xception Residual Recurrent Neural Network for Audio Event Detection and Tagging
- Computer Science
- 2018
The Xception Stacked Residual Recurrent Neural Network (XRRNN) is proposed, based on modifications of the system CVSSP by Xu et al. (2017), that won the challenge for the AT task.
Time-Frequency Feature Decomposition Based on Sound Duration for Acoustic Scene Classification
- Physics
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020
Analysis of detailed experimental results reveals that (1) long-duration sounds are generally most informative for acoustic scene classification; and (2) the focus of sound duration may be different for classifying different types of acoustic scenes.
Acoustic Scene Classification Using Audio Tagging
- Computer Science
- INTERSPEECH
- 2020
A novel scheme for acoustic scene classification is proposed that adopts an audio tagging system inspired by the human perception mechanism and shows effectiveness on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 task 1-a dataset.
Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events
- Computer Science
- ACM Multimedia
- 2019
This work modifies the attention-based feedforward structure so that the resulting model can use audio as well as video to compute sound event predictions, and makes a compelling case for devoting more attention to research in multimodal audiovisual classification.
Audio Sound Determination Using Feature Space Attention Based Convolution Recurrent Neural Network
- Computer Science, Physics
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020
Improved results on the latest TUT Sound Event 2017 dataset demonstrate the effectiveness of the proposed feature space attention based convolution recurrent neural network approach, which utilizes the varying importance of each feature dimension to perform acoustic event detection.
DCASENET: A joint pre-trained deep neural network for detecting and classifying acoustic scenes and events
- Computer Science
- 2020
This study proposes an integrated deep neural network that can perform three tasks: acoustic scene classification, audio tagging, and sound event detection. It shows that the proposed system, DCASENet, can be used directly for any of these tasks with competitive results, or further finetuned for the target task.
Progressive Training Of Convolutional Neural Networks For Acoustic Events Classification
- Computer Science
- 2020 28th European Signal Processing Conference (EUSIPCO)
- 2021
Experimental results suggest that progressive resizing methods improve the performance of audio event classification models and introduce a complementary gain in performance with respect to the original technique.
References
Exploiting spectro-temporal locality in deep learning based acoustic event detection
- Computer Science
- EURASIP J. Audio Speech Music. Process.
- 2015
Two feature extraction strategies are explored: using multiple resolution spectrograms simultaneously, analyzing the overall and event-wise influence to combine the results; and using convolutional neural networks (CNN), a state-of-the-art 2D feature extraction model that exploits local structures, with log power spectrogram input for AED.
Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge
- Computer Science
- IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2018
The emergence of deep learning as the most popular classification method is observed, replacing the traditional approaches based on Gaussian mixture models and support vector machines.
Rare Sound Event Detection Using 1D Convolutional Recurrent Neural Networks
- Computer Science
- DCASE
- 2017
The proposed system, using a combination of a 1D convolutional neural network and a recurrent neural network (RNN) with long short-term memory (LSTM) units, achieved 1st place in the challenge with an error rate of 0.13 and an F-Score of 93.1.
Ensemble of Convolutional Neural Networks for Weakly-Supervised Sound Event Detection Using Multiple Scale Input
- Computer Science
- 2017
The proposed model, an ensemble of convolutional neural networks to detect audio events in the automotive environment, achieved the 2nd place on audio tagging and the 1st place on sound event detection.
Deep Sequential Image Features for Acoustic Scene Classification
- Computer Science
- 2017
This work proposes a novel method to classify 15 different acoustic scenes using deep sequential learning, based on features extracted from Short-Time Fourier Transform and scalogram of the audio scenes using Convolutional Neural Networks.
Improved audio features for large-scale multimedia event detection
- Computer Science
- 2014 IEEE International Conference on Multimedia and Expo (ICME)
- 2014
While the overall finding is that MFCC features perform best, it is found that ANN as well as LSP features provide complementary information at various levels of temporal resolution.
Convolutional Recurrent Neural Networks for Rare Sound Event Detection
- Computer Science
- DCASE
- 2017
A convolutional recurrent neural network (CRNN) is proposed for rare sound event detection that provides significant performance improvement over two other deep learning based methods mainly due to its capability of longer term temporal modeling.
Recurrent neural networks for polyphonic sound event detection in real life recordings
- Computer Science
- 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2016
In this paper we present an approach to polyphonic sound event detection in real life recordings based on bi-directional long short term memory (BLSTM) recurrent neural networks (RNNs). A single…
Detection of overlapping acoustic events using a temporally-constrained probabilistic model
- Computer Science
- 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2016
Results show that the proposed system outperforms several state-of-the-art methods for overlapping acoustic event detection on the same task, using both frame-based and event-based metrics, and is robust to varying event density and noise levels.
Large-Scale Weakly Supervised Audio Classification Using Gated Convolutional Neural Network
- Computer Science
- 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
In this paper, we present a gated convolutional neural network and a temporal attention-based localization method for audio classification, which won the 1st place in the large-scale weakly…