Learning Sound Event Classifiers from Web Audio with Noisy Labels

@article{Fonseca2019LearningSE,
  title={Learning Sound Event Classifiers from Web Audio with Noisy Labels},
  author={Eduardo Fonseca and Manoj Plakal and Daniel P. W. Ellis and Frederic Font and Xavier Favory and Xavier Serra},
  journal={ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2019},
  pages={21--25}
}
  • Eduardo Fonseca, M. Plakal, X. Serra
  • Published 4 January 2019
  • Computer Science
  • ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
As sound event classification moves towards larger datasets, issues of label noise become inevitable. Web sites can supply large volumes of user-contributed audio and metadata, but inferring labels from this metadata introduces errors due to unreliable inputs, and limitations in the mapping. There is, however, little research into the impact of these errors. To foster the investigation of label noise in sound event classification we present FSDnoisy18k, a dataset containing 42.5 hours of audio… 

Model-Agnostic Approaches To Handling Noisy Labels When Training Sound Event Classifiers
TLDR
This work evaluates simple and efficient model-agnostic approaches to handling noisy labels when training sound event classifiers, namely label smoothing regularization, mixup and noise-robust loss functions, which can be easily incorporated into existing deep learning pipelines without the need for network modifications or extra resources.
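Two of the techniques named above, label smoothing and mixup, can be sketched in a few lines. This is a minimal illustration of the general techniques, not the cited paper's implementation: the function names, the `epsilon` and `alpha` defaults, and the list-based data layout are all assumptions made for the example.

```python
import random

def smooth_labels(one_hot, epsilon=0.1):
    """Label smoothing: move a fraction epsilon of the mass to a uniform distribution.

    epsilon=0.1 is an assumed, illustrative value.
    """
    k = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / k for y in one_hot]

def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """mixup: convex combination of two examples and of their label vectors.

    The mixing weight is drawn from Beta(alpha, alpha); alpha=0.2 is assumed.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam

# Usage: soften a one-hot target, then blend two toy examples.
soft = smooth_labels([0.0, 1.0, 0.0, 0.0])
mixed_x, mixed_y, lam = mixup_pair([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0])
```

Both operations act only on the inputs and targets, which is why they fit any classifier without changes to the network.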
Audio Tagging by Cross Filtering Noisy Labels
TLDR
This article presents a novel framework, named CrossFilter, to combat the noisy labels problem in audio tagging; it achieves state-of-the-art performance and even surpasses ensemble models on the FSDKaggle2018 dataset.
Audio Tagging using Linear Noise Modelling Layer
TLDR
Results show that modelling the noise distribution improves the accuracy of the baseline network in a similar capacity to the soft bootstrapping loss.
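One common way to model a label-noise distribution is a linear layer that maps the classifier's clean-class probabilities to probabilities over the observed (noisy) labels through a transition matrix. The sketch below shows that generic idea only; whether the cited paper uses exactly this form is an assumption, and `apply_noise_layer` and the example matrix are hypothetical.

```python
def apply_noise_layer(clean_probs, transition):
    """Map clean-class probabilities to noisy-label probabilities.

    transition[i][j] is the assumed probability that true class i
    is observed with label j; each row should sum to 1.
    """
    k = len(clean_probs)
    return [
        sum(clean_probs[i] * transition[i][j] for i in range(k))
        for j in range(k)
    ]

# Usage: class 0 is mislabelled as class 1 twenty percent of the time (assumed).
noisy = apply_noise_layer([0.5, 0.5], [[0.8, 0.2], [0.0, 1.0]])
```

During training the loss would be computed on the noisy-label probabilities, while the clean probabilities are used at test time.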
Multi-Label Audio Tagging with Noisy Labels and Variable Length (Detection and Classification of Acoustic Scenes and Events 2019 Challenge, Technical Report)
TLDR
This paper proposes a data generation method named Dominate Mixup, which can restrain the impact of incorrect labels during backpropagation and is suitable for multi-class classification problems.
The Benefit of Temporally-Strong Labels in Audio Event Classification
  • Shawn Hershey, D. Ellis, M. Plakal
  • Computer Science
    ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2021
TLDR
It is shown that fine-tuning with a mix of weak- and strongly-labeled data can substantially improve classifier performance, even when evaluated using only the original weak labels.
Audio tagging with noisy labels and minimal supervision
TLDR
This paper presents the task setup, the FSDKaggle2019 dataset prepared for this scientific evaluation, and a baseline system consisting of a convolutional neural network.
The Impact of Missing Labels and Overlapping Sound Events on Multi-label Multi-instance Learning for Sound Event Classification
TLDR
This paper investigates two state-of-the-art methodologies that allow this type of learning, low-resolution multi-label non-negative matrix deconvolution (LRM-NMD) and CNN, and shows good robustness to missing labels.
Supervised Classifiers for Audio Impairments with Noisy Labels
TLDR
It is demonstrated that a CNN can generalize better on training data with a large number of noisy labels and gives remarkably higher test performance.
ARCA23K: An audio dataset for investigating open-set label noise
TLDR
It is shown that the majority of labelling errors in ARCA23K are due to out-of-vocabulary audio clips, and this type of label noise is referred to as open-set label noise.
SeCoST: Sequential Co-Supervision for Large Scale Weakly Labeled Audio Event Detection
  • Anurag Kumar, V. Ithapu
  • Computer Science
    ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2020
TLDR
A new framework for designing learning models with weak supervision by bridging ideas from sequential learning and knowledge distillation is proposed, referred to as SeCoST (pronounced Sequest) — Sequential Co-supervision for training generations of Students.

References

SHOWING 1-10 OF 30 REFERENCES
A Closer Look at Weak Label Learning for Audio Events
TLDR
This work describes a CNN-based approach for weakly supervised training of audio events, describes important characteristics that naturally arise in weakly supervised learning of sound events, and shows how these aspects of weak labels affect the generalization of models.
DCASE 2018 task 2: iterative training, label smoothing, and background noise normalization for audio event tagging
TLDR
This paper describes an approach from the submissions for DCASE 2018 Task 2: general-purpose audio tagging of Freesound content with AudioSet labels, and proposes using pseudo-labels for automatic label verification and label smoothing to reduce overfitting.
Audio Set: An ontology and human-labeled dataset for audio events
TLDR
The creation of Audio Set is described, a large-scale dataset of manually-annotated audio events that endeavors to bridge the gap in data availability between image and audio research and substantially stimulate the development of high-performance audio event recognizers.
Data-efficient weakly supervised learning for low-resource audio event detection using deep learning
TLDR
Data-efficient training of a stacked convolutional and recurrent neural network is proposed in a multi-instance learning setting, for which a new loss function is introduced that leads to improved training compared to the usual approaches for weakly supervised learning.
Iterative Learning with Open-set Noisy Labels
TLDR
A novel iterative learning framework for training CNNs on datasets with open-set noisy labels that detects noisy labels and learns deep discriminative features in an iterative fashion and designs a Siamese network to encourage clean labels and noisy labels to be dissimilar.
Training general-purpose audio tagging networks with noisy labels and iterative self-verification
This paper describes our submission to the first Freesound general-purpose audio tagging challenge carried out within the DCASE 2018 challenge. Our proposal is based on a fully convolutional neural…
Joint Optimization Framework for Learning with Noisy Labels
TLDR
This work proposes a joint optimization framework for learning DNN parameters and estimating true labels, which can correct labels during training by alternately updating network parameters and labels.
Semi-supervised learning helps in sound event classification
  • Zixing Zhang, Björn Schuller
  • Computer Science
    2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2012
TLDR
Adding unlabelled sound event data to the training set, once automatic labelling reaches a sufficient classifier confidence level, can significantly enhance classification performance; combined with optimal re-sampling of the originally labelled instances and iterative semi-supervised learning, the gain can reach approximately half of that achieved by using the originally manually labelled data.
Learning from Noisy Large-Scale Datasets with Minimal Supervision
TLDR
An approach is presented that effectively uses millions of images with noisy annotations in conjunction with a small subset of cleanly-annotated images to learn powerful image representations; it is particularly effective for a large number of classes with a wide range of noise in annotations.
Training Deep Neural Networks on Noisy Labels with Bootstrapping
TLDR
A generic way to handle noisy and incomplete labeling by augmenting the prediction objective with a notion of consistency is proposed, which considers a prediction consistent if the same prediction is made given similar percepts, where the notion of similarity is between deep network features computed from the input data.
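The soft bootstrapping loss associated with this line of work is usually written as a cross-entropy against a blend of the given label and the model's own prediction, L = -Σ_k [β t_k + (1 - β) q_k] log q_k. The sketch below is a minimal pure-Python version under that formulation; the function name, the default `beta=0.95` and the `eps` clamp are assumptions for illustration.

```python
import math

def soft_bootstrap_loss(probs, target, beta=0.95, eps=1e-12):
    """Cross-entropy against beta * target + (1 - beta) * model prediction.

    probs:  predicted class probabilities q_k
    target: (possibly noisy) label vector t_k
    beta:   trust placed in the given label (assumed default)
    """
    blended = [beta * t + (1.0 - beta) * q for t, q in zip(target, probs)]
    # eps guards against log(0) for zero-probability classes.
    return -sum(b * math.log(max(q, eps)) for b, q in zip(blended, probs))

# Usage: the model is confident in class 1, but the label says class 0.
loss = soft_bootstrap_loss([0.1, 0.9], [1.0, 0.0])
```

With `beta=1.0` this reduces to ordinary cross-entropy; lower values penalize confident disagreement with a possibly wrong label less heavily.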