Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)

@inproceedings{Plumbley2016ProceedingsOT,
  title={Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)},
  author={Mark D. Plumbley and Christian Kroos and Juan Pablo Bello and Ga{\"e}l Richard and Daniel P. W. Ellis and Annamaria Mesaros},
  year={2016}
}
In this paper, we propose the use of spatial and harmonic features in combination with a long short-term memory (LSTM) recurrent neural network (RNN) for the automatic sound event detection (SED) task. Real-life sound recordings typically have many overlapping sound events, making them hard to recognize with just mono-channel audio. Human listeners successfully recognize mixtures of overlapping sound events using pitch cues and by exploiting the stereo (multichannel) audio signal available… 


Acoustic Scene Classification Using Deep Audio Feature and BLSTM Network
TLDR
This work addresses acoustic scene classification for the Detection and Classification of Acoustic Scenes and Events 2017 (DCASE2017) challenge, using a Deep Audio Feature (DAF) for acoustic scene representation and a Bidirectional Long Short Term Memory (BLSTM) network as the classifier for acoustic scene classification.
Sound Event Detection Using Multiple Optimized Kernels
TLDR
Experimental results on different subsets of AudioSet demonstrate the performance of the proposed approach compared to state-of-the-art systems.
GCC-PHAT Cross-Correlation Audio Features for Simultaneous Sound Event Localization and Detection (SELD) on Multiple Rooms
In this work, we show a simultaneous sound event localization and detection (SELD) system, with enhanced acoustic features, in which we propose using the well-known Generalized Cross Correlation
An Xception Residual Recurrent Neural Network for Audio Event Detection and Tagging
TLDR
The Xception Stacked Residual Recurrent Neural Network (XRRNN) is proposed, based on modifications of the system CVSSP by Xu et al. (2017), that won the challenge for the AT task.
Time-Frequency Feature Decomposition Based on Sound Duration for Acoustic Scene Classification
  • Yuzhong Wu, Tan Lee
  • Physics
    ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2020
TLDR
Analysis of detailed experimental results reveals that (1) long-duration sounds are generally most informative for acoustic scene classification; and (2) the focus of sound duration may be different for classifying different types of acoustic scenes.
Acoustic Scene Classification Using Audio Tagging
TLDR
A novel scheme for acoustic scene classification is proposed that adopts an audio tagging system inspired by the human perception mechanism; it shows effectiveness on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 Task 1-A dataset.
Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events
TLDR
This work modifies this attention-based feedforward structure in a way that allows the resulting model to use audio as well as video to compute sound event predictions, and makes a compelling case for devoting more attention to research in multimodal audiovisual classification.
Audio Sound Determination Using Feature Space Attention Based Convolution Recurrent Neural Network
TLDR
Improved performance on the latest TUT Sound Event 2017 dataset demonstrates the effectiveness of the proposed feature space attention based convolution recurrent neural network approach, which utilizes the varying importance of each feature dimension to perform acoustic event detection.
DCASENET: A joint pre-trained deep neural network for detecting and classifying acoustic scenes and events
TLDR
This study proposes an integrated deep neural network that can perform three tasks: acoustic scene classification, audio tagging, and sound event detection. It shows that the proposed system, DCASENet, can be used directly for any of these tasks with competitive results, or further fine-tuned for the target task.
Progressive Training Of Convolutional Neural Networks For Acoustic Events Classification
TLDR
Experimental results suggest that progressive resizing methods improve the performance of audio event classification models and introduce a complementary gain in performance with respect to the original technique.

References

Exploiting spectro-temporal locality in deep learning based acoustic event detection
TLDR
Two different feature extraction strategies are explored: using multiple resolution spectrograms simultaneously, analyzing the overall and event-wise influence to combine the results; and using convolutional neural networks (CNNs), a state-of-the-art 2D feature extraction model that exploits local structures, with log power spectrogram input for AED.
Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge
TLDR
The emergence of deep learning as the most popular classification method is observed, replacing the traditional approaches based on Gaussian mixture models and support vector machines.
Rare Sound Event Detection Using 1D Convolutional Recurrent Neural Networks
TLDR
The proposed system, using a combination of a 1D convolutional neural network and a recurrent neural network (RNN) with long short-term memory (LSTM) units, achieved 1st place in the challenge with an error rate of 0.13 and an F-Score of 93.1.
Ensemble of Convolutional Neural Networks for Weakly-Supervised Sound Event Detection Using Multiple Scale Input
TLDR
The proposed model, an ensemble of convolutional neural networks to detect audio events in the automotive environment, achieved the 2nd place on audio tagging and the 1st place on sound event detection.
Deep Sequential Image Features for Acoustic Scene Classification
TLDR
This work proposes a novel method to classify 15 different acoustic scenes using deep sequential learning, based on features extracted from Short-Time Fourier Transform and scalogram of the audio scenes using Convolutional Neural Networks.
Improved audio features for large-scale multimedia event detection
TLDR
While the overall finding is that MFCC features perform best, it is found that ANN as well as LSP features provide complementary information at various levels of temporal resolution.
Convolutional Recurrent Neural Networks for Rare Sound Event Detection
TLDR
A convolutional recurrent neural network (CRNN) is proposed for rare sound event detection that provides significant performance improvement over two other deep learning based methods mainly due to its capability of longer term temporal modeling.
Recurrent neural networks for polyphonic sound event detection in real life recordings
In this paper we present an approach to polyphonic sound event detection in real life recordings based on bi-directional long short term memory (BLSTM) recurrent neural networks (RNNs). A single
Detection of overlapping acoustic events using a temporally-constrained probabilistic model
TLDR
Results show that the proposed system outperforms several state-of-the-art methods for overlapping acoustic event detection on the same task, using both frame-based and event-based metrics, and is robust to varying event density and noise levels.
Large-Scale Weakly Supervised Audio Classification Using Gated Convolutional Neural Network
In this paper, we present a gated convolutional neural network and a temporal attention-based localization method for audio classification, which won the 1st place in the large-scale weakly