• Corpus ID: 221088318

MEAN TEACHER WITH DATA AUGMENTATION FOR DCASE 2019 TASK 4 Technical Report

@inproceedings{DelphinPoulat2019MEANTW,
  title={MEAN TEACHER WITH DATA AUGMENTATION FOR DCASE 2019 TASK 4 Technical Report},
  author={Lionel Delphin-Poulat and Cyril Plapous},
  year={2019}
}
In this paper, we present our neural network for Task 4 of the DCASE 2019 challenge (Sound event detection in domestic environments) [1]. The goal of the task is to evaluate systems for the detection of sound events using real data that is either weakly labeled or unlabeled, together with simulated data that is strongly labeled. We propose a mean-teacher model with a convolutional neural network (CNN) and a recurrent neural network (RNN), together with data augmentation and a median window tuned for each class based on…
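The two mechanisms named in the abstract, a mean-teacher model and a per-class median window, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the smoothing factor `alpha`, the flat-list representation of weights, and the example window lengths are all assumptions made for illustration. The teacher update follows the general mean-teacher idea of averaging model weights over training steps, and the post-processing applies a sliding median of class-specific length to binary frame decisions.

```python
from statistics import median


def ema_update(teacher, student, alpha=0.999):
    """Move teacher weights toward student weights (exponential moving average).

    `teacher` and `student` are flat lists of floats here (an assumption);
    a real model would hold tensors, but the update rule is the same.
    """
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def median_filter(frames, window):
    """Smooth a binary frame sequence with a sliding median.

    The window is truncated at the sequence edges, and a tie (median of 0.5)
    is resolved to 0, i.e. "no event".
    """
    half = window // 2
    smoothed = []
    for i in range(len(frames)):
        lo = max(0, i - half)
        hi = min(len(frames), i + half + 1)
        smoothed.append(1 if median(frames[lo:hi]) > 0.5 else 0)
    return smoothed


def smooth_per_class(decisions, windows):
    """Apply a class-specific median window to each class's frame decisions."""
    return {name: median_filter(seq, windows[name]) for name, seq in decisions.items()}


# Hypothetical class names and window lengths, chosen only for the example.
decisions = {
    "Speech": [0, 1, 0, 0, 1, 1, 0, 1, 1, 0],
    "Dog": [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
}
windows = {"Speech": 3, "Dog": 5}
smoothed = smooth_per_class(decisions, windows)
```

In this toy input, the isolated one-frame detections are suppressed and the one-frame gap inside the longer "Speech" segment is filled. A longer window smooths more aggressively, which is why tuning it per class can help when event durations differ widely between classes.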

MEAN TEACHER WITH SOUND SOURCE SEPARATION AND DATA AUGMENTATION FOR DCASE 2020 TASK 4 Technical Report
TLDR
A mean-teacher model with a convolutional and recurrent neural network structure is proposed, which adopts data augmentation and a sound source separation technique to improve the performance of sound event detection.
MULTI-SCALE RESIDUAL CRNN WITH DATA AUGMENTATION FOR DCASE 2020 TASK 4 Technical Report
TLDR
This technical report improves the baseline by using a variety of data augmentation methods and synthesizing more complex synthetic data for training, and presents a multi-scale residual convolutional recurrent neural network (CRNN) to solve the problem of multi-scale detection.
SEMI-SUPERVISED SOUND EVENT DETECTION BASED ON MEAN TEACHER WITH POWER POOLING AND DATA AUGMENTATION Technical Report
TLDR
The details of the system submitted to DCASE 2020 Task 4 (sound event detection (SED) and separation in domestic environments) are described; the system mainly focuses on the scenario of recognizing sound events without source separation.
ADAPTIVE FOCAL LOSS WITH DATA AUGMENTATION FOR SEMI-SUPERVISED SOUND EVENT DETECTION Technical Report
TLDR
This technical report describes the system submitted to DCASE 2021 Task 4 (sound event detection and separation in domestic environments), and proposes several methods to improve performance, such as SpecAugment data augmentation, adaptive focal loss, and event-specific post-processing.
TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS: DCASE 2021 CHALLENGE, TASK 4 Technical Report
TLDR
To overcome the challenge of using unlabeled data, the submitted approach used semi-supervised learning, and to improve the performance further, the model can slightly outperform the baseline with fewer filters and therefore fewer parameters.
CONVOLUTION-AUGMENTED TRANSFORMER FOR SEMI-SUPERVISED SOUND EVENT DETECTION Technical Report
TLDR
This model employs conformer blocks, which combine the self-attention and depth-wise convolution networks, to efficiently capture the global and local context information of an audio feature sequence.
SOUND EVENT DETECTION IN DOMESTIC ENVIRONMENTS USING DENSE RECURRENT NEURAL NETWORK Technical Report
TLDR
The authors' sound event detection system, which uses a mean-teacher model with a convolutional recurrent neural network (CRNN) for DCASE 2020 Task 4, achieves a 15% improvement in macro-averaged F-score on the development set compared to the baseline.
JOINT TRAINING OF GUIDED LEARNING AND MEAN TEACHER MODELS FOR SOUND EVENT DETECTION
TLDR
This paper's proposed model structure includes a feature-level front-end based on convolutional neural networks (CNN), followed by both embedding-level and instance-level back-end attention modules, and a set of adaptive median windows for individual sound events is used to smooth the frame-level predictions in post-processing.
Self-training with noisy student model and semi-supervised loss function for dcase 2021 challenge task 4
TLDR
The performance of the proposed SED model is evaluated on the validation set of the DCASE 2021 Challenge Task 4, and several ensemble models that combine five-fold validation models with different hyperparameters of the semi-supervised loss function are finally selected as final models.
CONVOLUTION-AUGMENTED CONFORMER FOR SOUND EVENT DETECTION Technical Report
TLDR
This model employs conformer blocks, which combine self-attention and depth-wise convolution networks, to efficiently capture the global and local context information of an audio feature sequence, and improves performance by using a mean-teacher semi-supervised learning technique and data augmentation for each sound event class.

References

MEAN TEACHER CONVOLUTION SYSTEM FOR DCASE 2018 TASK 4
TLDR
A mean-teacher model with a context-gating convolutional neural network (CNN) and a recurrent neural network (RNN) is proposed to maximize the use of the unlabeled in-domain dataset.
Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results
TLDR
The recently proposed Temporal Ensembling has achieved state-of-the-art results in several semi-supervised learning benchmarks, but it becomes unwieldy when learning large datasets, so Mean Teacher, a method that averages model weights instead of label predictions, is proposed.
Recurrent neural networks for polyphonic sound event detection in real life recordings
In this paper we present an approach to polyphonic sound event detection in real life recordings based on bi-directional long short term memory (BLSTM) recurrent neural networks (RNNs). A single
Environmental sound classification with convolutional neural networks
  • Karol J. Piczak
  • Computer Science
    2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP)
  • 2015
TLDR
The model outperforms baseline implementations relying on mel-frequency cepstral coefficients and achieves results comparable to other state-of-the-art approaches.