Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge
@article{Mesaros2018DetectionAC,
  title={Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge},
  author={Annamaria Mesaros and Toni Heittola and Emmanouil Benetos and Peter Foster and Mathieu Lagrange and Tuomas Virtanen and Mark D. Plumbley},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  year={2018},
  volume={26},
  pages={379--393}
}
Public evaluation campaigns and datasets promote active development in target research areas, allowing direct comparison of algorithms. [...]
Key result: The datasets created for and used in DCASE 2016 are publicly available and are a valuable resource for further research.
210 Citations
Sound Event Detection in the DCASE 2017 Challenge
- Computer Science
- IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2019
Analysis of the systems' behavior reveals that task-specific optimization plays a major role in producing good performance; however, this optimization often closely follows the ranking metric, and maximizing or minimizing that metric does not result in universally good performance.
The effect of room acoustics on audio event classification
- Computer Science
- 2019
The impact of mismatches between training and testing conditions in terms of acoustical parameters, including the reverberation time (T60) and the direct-to-reverberant ratio (DRR), on audio classification accuracy and class separability is studied.
Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)
- Computer Science
- 2016
The proposed SED system is compared against the state-of-the-art mono-channel method on the development subset of the TUT sound events detection 2016 database, and the use of spatial and harmonic features is shown to improve the performance of SED.
A Review of Deep Learning Based Methods for Acoustic Scene Classification
- Computer Science
- 2020
This article summarizes and groups existing approaches for data preparation, i.e., feature representations, feature pre-processing, and data augmentation, and for data modeling, i.
Device-Robust Acoustic Scene Classification Based on Two-Stage Categorization and Data Augmentation
- Computer Science
- ArXiv
- 2020
This technical report presents a joint effort of four groups, namely GT, USTC, Tencent, and UKE, to tackle Task 1 - Acoustic Scene Classification (ASC) in the DCASE 2020 Challenge, proposing a novel two-stage ASC system that leverages an ad-hoc score combination of two convolutional neural networks.
Open-Set Acoustic Scene Classification with Deep Convolutional Autoencoders
- Computer Science
- DCASE
- 2019
This paper describes an open-set acoustic scene classification system submitted to task 1C of the Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge 2019, which combines convolutional neural networks for closed-set identification with deep convolutional autoencoders for outlier detection.
TASK 3 DCASE 2020: SOUND EVENT LOCALIZATION AND DETECTION USING RESIDUAL SQUEEZE-EXCITATION CNNS Technical Report
- Computer Science
- 2020
This work aims to improve the accuracy of the baseline CRNN by adding residual squeeze-excitation (SE) blocks in the convolutional part of the CRNN, and shows that simply introducing the residual SE blocks yields development-phase results that clearly exceed the baseline.
A multi-device dataset for urban acoustic scene classification
- Computer Science
- DCASE
- 2018
The acoustic scene classification task of DCASE 2018 Challenge and the TUT Urban Acoustic Scenes 2018 dataset provided for the task are introduced, and the performance of a baseline system in the task is evaluated.
DCASE 2018 Challenge - Task 5: Monitoring of domestic activities based on multi-channel acoustics
- Computer Science
- ArXiv
- 2018
The setup of Task 5 is presented, including a description of the task, the dataset, and the baseline system, which is intended to lower the hurdle to participating in the challenge and to provide a reference performance.
Adaptive Distance-Based Pooling in Convolutional Neural Networks for Audio Event Classification
- Computer Science
- IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2020
A new type of pooling layer is proposed that aims to compensate for non-relevant information in audio events by applying an adaptive transformation of the convolutional feature maps along the temporal axis, following a uniform distance subsampling criterion on the learned feature space.
References
SHOWING 1-10 OF 70 REFERENCES
Sound event detection in synthetic audio: Analysis of the DCASE 2016 task results
- Computer Science
- 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
- 2017
This task, which follows the ‘Event Detection-Office Synthetic’ task of DCASE 2013, studies the behaviour of tested algorithms when facing controlled levels of audio complexity with respect to background noise and polyphony/density.
Proceedings of the Detection and Classification of Acoustic Scenes and Events 2016 Workshop (DCASE2016), Budapest, Hungary, 3 Sep 2016.
- Computer Science
- 2016
The proposed SED system is compared against the state-of-the-art mono-channel method on the development subset of the TUT sound events detection 2016 database, and the use of spatial and harmonic features is shown to improve the performance of SED.
DCASE 2016 Acoustic Scene Classification Using Convolutional Neural Networks
- Computer Science
- DCASE
- 2016
This workshop paper presents the use of a convolutional neural network trained to classify short sequences of audio, represented by their log-mel spectrogram, and proposes a training method that can be used when the system validation performance saturates as the training proceeds.
Exploiting spectro-temporal locality in deep learning based acoustic event detection
- Computer Science
- EURASIP J. Audio Speech Music. Process.
- 2015
Two feature extraction strategies are explored for acoustic event detection (AED): first, using multiple-resolution spectrograms simultaneously and analyzing the overall and event-wise influence to combine the results; second, using convolutional neural networks (CNNs), a state-of-the-art 2D feature extraction model that exploits local structures, with log-power spectrogram input.
CP-JKU SUBMISSIONS FOR DCASE-2016 : A HYBRID APPROACH USING BINAURAL I-VECTORS AND DEEP CONVOLUTIONAL NEURAL NETWORKS
- Computer Science
- 2016
This report describes the CP-JKU team's four submissions for Task 1 (Audio scene classification) of the DCASE-2016 challenge and proposes a novel i-vector extraction scheme for ASC using both left and right audio channels, together with a Deep Convolutional Neural Network architecture trained on spectrograms of audio excerpts in an end-to-end fashion.
CQT-based Convolutional Neural Networks for Audio Scene Classification
- Computer Science
- DCASE
- 2016
This paper shows that a Constant-Q-transformed input to a Convolutional Neural Network improves results, and proposes a parallel (graph-based) neural network architecture that captures relevant audio characteristics in both time and frequency.
Assessment of human and machine performance in acoustic scene classification: DCASE 2016 case study
- Physics
- 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
- 2017
Human and machine performance in acoustic scene classification is examined through a parallel experiment using the TUT Acoustic Scenes 2016 dataset; an expert listener trained for the task obtained accuracy similar to the average of the submitted systems.
TUT database for acoustic scene classification and sound event detection
- Computer Science, Physics
- 2016 24th European Signal Processing Conference (EUSIPCO)
- 2016
The recording and annotation procedure, the database content, a recommended cross-validation setup, and the performance of a supervised acoustic scene classification system and an event detection baseline system using mel-frequency cepstral coefficients and Gaussian mixture models are presented.
ACOUSTIC SCENE CLASSIFICATION USING PARALLEL COMBINATION OF LSTM AND CNN
- Computer Science
- 2016
This paper proposes a neural network architecture for exploiting sequential information, composed of two separate lower networks and one upper network, which are referred to as LSTM layers, CNN layers, and connected layers, respectively.
Score Fusion of Classification Systems for Acoustic Scene Classification
- Computer Science
- 2016
This study explores several methods in three aspects (feature extraction, generative/discriminative machine learning, and score fusion for the final decision) on the acoustic scene classification task of the IEEE AASP Challenge: Detection and Classification of Acoustic Scenes and Events.