Corpus ID: 102352325

Cross-task learning for audio tagging, sound event detection and spatial localization: DCASE 2019 baseline systems

@article{Kong2019CrosstaskLF,
  title={Cross-task learning for audio tagging, sound event detection and spatial localization: DCASE 2019 baseline systems},
  author={Qiuqiang Kong and Yin Cao and Turab Iqbal and Yong Xu and Wenwu Wang and Mark D. Plumbley},
  journal={ArXiv},
  year={2019},
  volume={abs/1904.05635}
}
The Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 challenge focuses on audio tagging, sound event detection and spatial localisation. DCASE 2019 consists of five tasks: 1) acoustic scene classification, 2) audio tagging with noisy labels and minimal supervision, 3) sound event localisation and detection, 4) sound event detection in domestic environments, and 5) urban sound tagging. In this paper, we propose generic cross-task baseline systems based on convolutional…
Sound event detection and localization based on CNN and LSTM Technical Report
The Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 challenge is a topic seminar for speech feature classification. Task 3 is the location and detection of sound events. In…
Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study
  • Dawei Liang, Yangyang Shi, +6 authors M. Seltzer
  • Computer Science, Engineering
  • 2021
TLDR
A dual-branch neural network architecture is developed for the joint learning of voice and acoustic features during an AED process, and thorough empirical studies are conducted to examine the performance on the public AudioSet with different types of inputs.
Domain Adaptation Neural Network for Acoustic Scene Classification in Mismatched Conditions
TLDR
The proposed DANN-based acoustic scene classification method is evaluated on subtask B of Task 1 of the DCASE 2019 ASC challenge, which is a closed-set classification problem whose audio recordings were made with mismatched devices.
Open-Set Acoustic Scene Classification with Deep Convolutional Autoencoders
TLDR
This paper describes an open-set acoustic scene classification system submitted to Task 1C of the Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge 2019, which combines convolutional neural networks for closed-set identification with deep convolutional autoencoders for outlier detection.
TWO-STAGE SOUND EVENT LOCALIZATION AND DETECTION USING INTENSITY VECTOR AND GENERALIZED CROSS-CORRELATION Technical Report
Sound event localization and detection (SELD) refers to the spatial and temporal localization of sound events in addition to classification. The Detection and Classification of Acoustic Scenes and…
WDXY SUBMISSION FOR DCASE-2019 : ACOUSTIC SCENE CLASSIFICATION WITH CONVOLUTION NEURAL NETWORKS
Acoustic Scene Classification (ASC) is the task of identifying the scene from which the audio signal is recorded. It is one of the core research problems in the field of Computational Sound Scene…
Robust Acoustic Scene Classification using a Multi-Spectrogram Encoder-Decoder Framework
TLDR
The experimental results highlight two main contributions: the first is an effective method for high-level feature extraction from multi-spectrogram input via the novel C-DNN architecture encoder network, and the second is the proposed decoder, which enables the framework to achieve competitive results on various datasets.
Augmented Strategy For Polyphonic Sound Event Detection
  • Bolun Wang, Zhonghua Fu, Hao Wu
  • Computer Science
  • 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
  • 2019
TLDR
An augmented strategy for polyphonic sound event classification is proposed that includes data augmentation to enrich the training set and reduce data imbalance, a new loss function that combines cross entropy and F-score, and model fusion to integrate the strengths of different classifiers.
TIME-FREQUENCY SEGMENTATION ATTENTION NEURAL NETWORK FOR URBAN SOUND TAGGING Technical Report
Audio tagging aims to assign one or more labels to the audio clip. In this task, we used the Time-Frequency Segmentation Attention Network (TFSANN) for urban sound tagging. In the training, the log…
URBAN SOUND TAGGING USING CONVOLUTIONAL NEURAL NETWORKS Technical Report
This technical report outlines our solution to Task 5 of the DCASE 2019 challenge, titled Urban Sound Tagging. The objective of the task is to label different sources of noise from raw audio data. A…

References

DCASE 2018 Challenge baseline with convolutional neural networks
TLDR
DCASE 2018 has five tasks: 1) acoustic scene classification, 2) general-purpose audio tagging, 3) bird audio detection, 4) weakly-labeled semi-supervised sound event detection and 5) multi-channel audio tagging. The Python baseline source code contains implementations of convolutional neural networks, including AlexNetish and VGGish, networks originating from computer vision.
Large-Scale Weakly Supervised Audio Classification Using Gated Convolutional Neural Network
In this paper, we present a gated convolutional neural network and a temporal attention-based localization method for audio classification, which won the 1st place in the large-scale weakly…
Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks
TLDR
The proposed convolutional recurrent neural network for joint sound event localization and detection (SELD) of multiple overlapping sound events in three-dimensional (3-D) space is generic and applicable to any array structures, robust to unseen DOA values, reverberation, and low SNR scenarios.
Audio tagging with noisy labels and minimal supervision
TLDR
This paper presents the task setup, the FSDKaggle2019 dataset prepared for this scientific evaluation, and a baseline system consisting of a convolutional neural network.
A multi-device dataset for urban acoustic scene classification
TLDR
The acoustic scene classification task of DCASE 2018 Challenge and the TUT Urban Acoustic Scenes 2018 dataset provided for the task are introduced, and the performance of a baseline system in the task is evaluated.
Learning Sound Event Classifiers from Web Audio with Noisy Labels
TLDR
Experiments suggest that training with large amounts of noisy data can outperform training with smaller amounts of carefully-labeled data, and it is shown that noise-robust loss functions can be effective in improving performance in the presence of corrupted labels.
DCASE 2017 Challenge setup: Tasks, datasets and baseline system
TLDR
This paper presents the setup of these tasks: task definition, dataset, experimental setup, and baseline system results on the development dataset.
TUT database for acoustic scene classification and sound event detection
TLDR
The recording and annotation procedure, the database content, a recommended cross-validation setup, and the performance of a supervised acoustic scene classification system and an event detection baseline system using mel frequency cepstral coefficients and Gaussian mixture models are presented.
Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection
TLDR
This work combines these two approaches in a convolutional recurrent neural network (CRNN), applies it to a polyphonic sound event detection task, and observes a considerable improvement on four different datasets consisting of everyday sound events.
Deep Neural Network Baseline for DCASE Challenge 2016
TLDR
The DCASE Challenge 2016 contains tasks for Acoustic Scene Classification (ASC), Acoustic Event Detection (AED), and audio tagging, and DNN baselines indicate that DNNs can be successful in many of these tasks, but may not always perform better than the baselines.