Weakly supervised CRNN system for sound event detection with large-scale unlabeled in-domain data
@article{Wang2018WeaklySC,
  title={Weakly supervised CRNN system for sound event detection with large-scale unlabeled in-domain data},
  author={Dezhi Wang and Lilun Zhang and Chang-chun Bao and Kele Xu and Boqing Zhu and Qiuqiang Kong},
  journal={ArXiv},
  year={2018},
  volume={abs/1811.00301}
}
Sound event detection (SED) is typically posed as a supervised learning problem that requires training data with strong temporal labels of sound events. However, producing datasets with strong labels normally requires prohibitive labor costs, which limits the practical application of supervised SED methods. Recent advances in SED therefore focus on detecting sound events by taking advantage of weakly labeled or unlabeled training data. In this paper, we propose a joint framework to…
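As a rough illustration of the kind of weakly supervised CRNN used for SED, the sketch below combines a small CNN front-end over log-mel spectrograms, a bidirectional GRU, a frame-level classifier and attention pooling that aggregates frame predictions into a clip-level prediction, so the model can be trained from clip-level (weak) labels alone. The layer sizes, mel-bin count and attention pooling are assumptions for illustration, not the paper's exact architecture.

```python
# Minimal PyTorch sketch of a weakly supervised CRNN for SED (illustrative only).
import torch
import torch.nn as nn

class WeakCRNN(nn.Module):
    def __init__(self, n_mels: int = 64, n_classes: int = 10, hidden: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d((1, 2)),                            # pool frequency only, keep time
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d((1, 2)),
        )
        self.rnn = nn.GRU(64 * (n_mels // 4), hidden, batch_first=True, bidirectional=True)
        self.frame_cls = nn.Linear(2 * hidden, n_classes)    # frame-level event probabilities
        self.frame_att = nn.Linear(2 * hidden, n_classes)    # frame-level attention weights

    def forward(self, x):
        # x: (batch, time, n_mels) log-mel spectrogram
        h = self.cnn(x.unsqueeze(1))                         # (batch, ch, time, n_mels / 4)
        h = h.permute(0, 2, 1, 3).flatten(2)                 # (batch, time, ch * n_mels / 4)
        h, _ = self.rnn(h)                                   # (batch, time, 2 * hidden)
        frame_prob = torch.sigmoid(self.frame_cls(h))        # strong (frame-level) output
        att = torch.softmax(self.frame_att(h), dim=1)        # attention over time, per class
        clip_prob = (frame_prob * att).sum(dim=1)            # weak (clip-level) output
        return clip_prob, frame_prob

model = WeakCRNN()
clip_prob, frame_prob = model(torch.randn(4, 500, 64))       # 4 clips, 500 frames, 64 mel bins
print(clip_prob.shape, frame_prob.shape)                     # (4, 10), (4, 500, 10)
```

Training would minimise a binary cross-entropy loss between `clip_prob` and the weak clip tags; `frame_prob` then provides the temporal localisation at inference time.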
4 Citations
Multi Model-Based Distillation for Sound Event Detection
- Computer Science · IEICE Trans. Inf. Syst.
- 2019
This letter proposes a novel multi model-based distillation approach for sound event detection that makes use of knowledge from multiple teacher models which are complementary in detecting sound events.
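A minimal sketch of generic multi-teacher distillation, assuming averaged temperature-scaled teacher outputs and a weighted soft/hard loss; the teacher models, temperature and weighting below are placeholders rather than the letter's actual recipe.

```python
# Generic multi-teacher knowledge distillation loss (illustrative placeholders throughout).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits_list, targets,
                      temperature: float = 2.0, alpha: float = 0.5):
    # Soft targets: average the teachers' temperature-scaled probability distributions.
    soft_targets = torch.stack(
        [F.softmax(t / temperature, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)
    # KL divergence between the student and the averaged teacher distribution.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        soft_targets, reduction="batchmean"
    ) * temperature ** 2
    # Ordinary cross-entropy on the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, targets)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

student = torch.randn(8, 10)                        # student logits for 8 clips, 10 classes
teachers = [torch.randn(8, 10) for _ in range(3)]   # logits from 3 complementary teachers
labels = torch.randint(0, 10, (8,))
print(distillation_loss(student, teachers, labels))
```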
Noise Robust Sound Event Detection Using Deep Learning and Audio Enhancement
- Computer Science · 2019 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)
- 2019
A unified approach to sound event detection is proposed that takes advantage of both deep learning and audio enhancement: a convolutional recurrent neural network is combined with a deep neural network to improve the performance of the SED classifiers, and an audio enhancement method based on an optimally modified log-spectral amplitude (OM-LSA) estimator is employed.
Multi-Representation Knowledge Distillation For Audio Classification
- Computer Science · Multim. Tools Appl.
- 2022
A novel end-to-end collaborative learning framework is presented that takes multiple representations as input to train models in parallel; it improves classification performance and achieves state-of-the-art results on both acoustic scene classification and general audio tagging tasks.
A Mobile Application for Sound Event Detection
- Computer Science · IJCAI
- 2019
The architecture of the solution comprises offline training and online detection; the latter covers acquisition of sensor data, processing of audio signals, and detection and recording of sound events.
References
Sound event detection using weakly-labeled semi-supervised data with GCRNNS, VAT and Self-Adaptive Label Refinement
- Computer Science · DCASE
- 2018
A gated convolutional recurrent neural network based approach is presented to solve Task 4 of the DCASE 2018 challenge, large-scale weakly labelled semi-supervised sound event detection in domestic environments; it introduces self-adaptive label refinement, a method which allows unsupervised adaptation of the trained system to refine the accuracy of frame-level class predictions.
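For the VAT component, a minimal sketch of the virtual adversarial training regulariser that can be computed on unlabeled clips is given below: it finds a small input perturbation that maximally changes the model's predictions and penalises that change. `model` is any network returning clip-level logits; the perturbation size, `xi` and the single power iteration are illustrative defaults, not the values used in the paper.

```python
# Virtual adversarial training (VAT) regulariser, minimal illustrative version.
import torch
import torch.nn.functional as F

def vat_loss(model, x, xi: float = 1e-6, epsilon: float = 1.0, n_power: int = 1):
    with torch.no_grad():
        pred = F.softmax(model(x), dim=-1)                  # reference prediction (no gradient)
    d = torch.randn_like(x)                                 # random initial direction
    for _ in range(n_power):                                # power iteration for adversarial direction
        d = xi * F.normalize(d.flatten(1), dim=1).view_as(x)
        d.requires_grad_(True)
        adv_logp = F.log_softmax(model(x + d), dim=-1)
        dist = F.kl_div(adv_logp, pred, reduction="batchmean")
        d = torch.autograd.grad(dist, d)[0].detach()
    r_adv = epsilon * F.normalize(d.flatten(1), dim=1).view_as(x)
    adv_logp = F.log_softmax(model(x + r_adv), dim=-1)
    return F.kl_div(adv_logp, pred, reduction="batchmean")  # penalise prediction change

# Example with a toy linear "model" on flattened features.
model = torch.nn.Linear(64, 10)
print(vat_loss(model, torch.randn(8, 64)))
```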
Adaptive Pooling Operators for Weakly Labeled Sound Event Detection
- Computer Science · IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2018
This paper treats SED as a multiple instance learning (MIL) problem, where training labels are static over a short excerpt, indicating the presence or absence of sound sources but not their temporal locality, and develops a family of adaptive pooling operators, referred to as autopool, which smoothly interpolate between common pooling operators and automatically adapt to the characteristics of the sound sources in question.
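The interpolation can be written compactly: with frame-level probabilities and a learnable per-class parameter alpha, a softmax over the alpha-scaled frames yields mean pooling at alpha = 0, softmax-weighted pooling at alpha = 1, and max pooling as alpha grows. A standalone PyTorch sketch of this idea (not the authors' reference implementation) follows.

```python
# Autopool-style adaptive pooling over frame-level probabilities (illustrative sketch).
import torch
import torch.nn as nn

class AutoPool(nn.Module):
    def __init__(self, n_classes: int):
        super().__init__()
        self.alpha = nn.Parameter(torch.zeros(n_classes))   # alpha = 0 starts as mean pooling

    def forward(self, frame_prob):
        # frame_prob: (batch, time, n_classes) frame-level event probabilities in [0, 1]
        weights = torch.softmax(self.alpha * frame_prob, dim=1)   # weights over time, per class
        return (frame_prob * weights).sum(dim=1)                  # (batch, n_classes) clip prediction

pool = AutoPool(n_classes=10)
clip_prob = pool(torch.rand(4, 500, 10))
print(clip_prob.shape)   # (4, 10)
```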
Sound event detection using weakly labeled dataset with stacked convolutional and recurrent neural network
- Computer Science · DCASE
- 2017
A stacked convolutional and recurrent neural network with two prediction layers in sequence, one for the strong label followed by one for the weak label, is proposed; it achieves a best error rate of 0.84 for strong labels and an F-score of 43.3% for weak labels on the unseen test split.
Large-scale weakly labelled semi-supervised CQT based sound event detection in domestic environments (technical report)
- Computer Science
- 2018
This paper proposes a constant-Q transform (CQT) based input feature for the baseline architecture to learn the start and end times of sound events (strong labels) in an audio recording given just the…
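For reference, a CQT feature matrix of the kind used as input can be extracted with librosa roughly as follows; the file path, hop length and bin counts are placeholders.

```python
# Constant-Q transform (CQT) feature extraction with librosa (placeholder settings).
import librosa
import numpy as np

y, sr = librosa.load("clip.wav", sr=44100, mono=True)        # placeholder audio file
cqt = np.abs(librosa.cqt(y, sr=sr, hop_length=512,
                         n_bins=84, bins_per_octave=12))     # (n_bins, n_frames)
log_cqt = librosa.amplitude_to_db(cqt, ref=np.max)           # log-magnitude CQT frames
print(log_cqt.shape)
```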
Sound event detection from weak annotations: weighted-GRU versus multi-instance-learning
- Computer Science · DCASE
- 2018
This paper addresses the detection of audio events in domestic environments when a weakly annotated dataset is available for training, and explores two approaches: a "weighted-GRU" (WGRU), in which a convolutional recurrent neural network is trained for classification and then exploited at the output of the time-distributed dense layer to perform localization, and a multi-instance-learning (MIL) approach.
frameCNN: A weakly-supervised learning framework for frame-wise acoustic event detection and classification
- Computer Science
- 2017
In this paper, we describe our contribution to the challenge of detection and classification of acoustic scenes and events (DCASE 2017). We propose frameCNN, a novel weakly-supervised learning…
Large-Scale Weakly Supervised Audio Classification Using Gated Convolutional Neural Network
- Computer Science · 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
In this paper, we present a gated convolutional neural network and a temporal attention-based localization method for audio classification, which won the 1st place in the large-scale weakly…
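A sketch of the two ingredients named above, with assumed layer sizes: a gated convolutional block (a gated linear unit acting as learnable attention on the feature maps) and a temporal attention pooling head that turns frame-level predictions into a clip-level tag while the attention weights indicate where in the clip the event occurs. The sigmoid attention normalised by its temporal sum is one common variant, not necessarily the paper's exact formulation.

```python
# Gated convolutional block and attention-based temporal localisation (illustrative sizes).
import torch
import torch.nn as nn

class GatedConvBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.lin = nn.Conv2d(in_ch, out_ch, 3, padding=1)    # "content" branch
        self.gate = nn.Conv2d(in_ch, out_ch, 3, padding=1)   # sigmoid gating branch

    def forward(self, x):
        return self.lin(x) * torch.sigmoid(self.gate(x))     # gated linear unit (GLU)

class AttentionLocalization(nn.Module):
    def __init__(self, n_features: int, n_classes: int):
        super().__init__()
        self.cls = nn.Linear(n_features, n_classes)
        self.att = nn.Linear(n_features, n_classes)

    def forward(self, h):
        # h: (batch, time, n_features) frame-level embeddings
        cla = torch.sigmoid(self.cls(h))                      # frame-level class probabilities
        att = torch.sigmoid(self.att(h))                      # frame-level attention weights
        clip = (cla * att).sum(dim=1) / att.sum(dim=1).clamp(min=1e-7)
        return clip, cla, att                                 # clip tag plus localisation cues

h = GatedConvBlock(1, 16)(torch.randn(2, 1, 240, 64))         # (2, 16, 240, 64)
emb = h.permute(0, 2, 1, 3).flatten(2)                        # (2, 240, 16 * 64)
clip, cla, att = AttentionLocalization(16 * 64, 17)(emb)
print(clip.shape, cla.shape, att.shape)
```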
A joint detection-classification model for audio tagging of weakly labelled data
- Computer Science · 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2017
This work proposes a joint detection-classification (JDC) model to detect and classify the audio clip simultaneously and shows that the JDC model reduces the equal error rate (EER) from 19.0% to 16.9%.
A Joint Separation-Classification Model for Sound Event Detection of Weakly Labelled Data
- Computer Science · 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
A joint separation-classification model trained only on weakly labelled audio data is proposed, that is, data where only the tags of an audio recording are known but the times of the events are unknown; the model outperforms a deep neural network baseline with an equal error rate of 0.29.
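A heavily simplified sketch of the separation-classification idea: the network predicts a time-frequency mask per event class, the masks yield per-class "separated" spectrograms, and clip-level tags are obtained by pooling the masks, so only weak labels are needed for training. Pooling the masks directly is a simplification of the paper's scheme, and the layer sizes are placeholders.

```python
# Simplified joint separation-classification sketch (not the paper's exact model).
import torch
import torch.nn as nn

class JointSeparationClassification(nn.Module):
    def __init__(self, n_classes: int = 10):
        super().__init__()
        # Small encoder that predicts one time-frequency mask per event class.
        self.mask_net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, n_classes, 3, padding=1),
        )

    def forward(self, spec):
        # spec: (batch, time, freq) magnitude spectrogram
        masks = torch.sigmoid(self.mask_net(spec.unsqueeze(1)))  # (batch, classes, time, freq)
        separated = masks * spec.unsqueeze(1)                    # per-class "separated" spectrograms
        clip_prob = masks.mean(dim=(2, 3))                       # clip-level tags pooled from the masks
        return clip_prob, separated

model = JointSeparationClassification()
clip_prob, separated = model(torch.rand(2, 240, 64))
print(clip_prob.shape, separated.shape)   # (2, 10), (2, 10, 240, 64)
```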
Orthogonality-Regularized Masked NMF for Learning on Weakly Labeled Audio Data
- Computer Science · 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
It is demonstrated that the proposed Orthogonality-Regularized Masked NMF (ORM-NMF) can be used for audio event detection of rare events; the method is evaluated on the development data from Task 2 of the DCASE 2017 Challenge.
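A rough sketch of the underlying NMF-based detection idea, without the orthogonality regularisation or masking scheme of the paper: spectral templates are learned separately from event exemplars and from background audio, each test frame is decomposed onto the combined dictionary with non-negative least squares, and the summed activation of the event templates over time serves as a detection curve. Shapes, component counts and the threshold are placeholders.

```python
# NMF-template-based rare event detection, generic sketch (placeholder data and sizes).
import numpy as np
from scipy.optimize import nnls
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
event_spec = rng.random((200, 64))        # placeholder: event exemplar frames (time, freq)
background_spec = rng.random((800, 64))   # placeholder: background frames (time, freq)
test_spec = rng.random((400, 64))         # placeholder: test clip frames (time, freq)

# Learn non-negative spectral templates for the event and for the background.
W_event = NMF(n_components=8, max_iter=500).fit(event_spec).components_             # (8, freq)
W_background = NMF(n_components=16, max_iter=500).fit(background_spec).components_  # (16, freq)
dictionary = np.vstack([W_event, W_background]).T                                   # (freq, 24)

# Decompose each test frame and read off how strongly the event templates are active.
activations = np.array([nnls(dictionary, frame)[0] for frame in test_spec])         # (time, 24)
event_activity = activations[:, :W_event.shape[0]].sum(axis=1)                      # detection curve
detected_frames = event_activity > event_activity.mean() + 2 * event_activity.std()
print(detected_frames.sum(), "frames flagged as containing the target event")
```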