Corpus ID: 237571954

ARCA23K: An audio dataset for investigating open-set label noise

Turab Iqbal, Yin Cao, Andrew Bailey, Mark D. Plumbley, Wenwu Wang
The availability of audio data on sound sharing platforms such as Freesound gives users access to large amounts of annotated audio. Utilising such data for training is becoming increasingly popular, but the problem of label noise that is often prevalent in such datasets requires further investigation. This paper introduces ARCA23K, an Automatically Retrieved and Curated Audio dataset comprised of over 23 000 labelled Freesound clips. Unlike past datasets such as FSDKaggle2018 and FSDnoisy18K…

FSD50K: an Open Dataset of Human-Labeled Sound Events
This paper introduces FSD50K, an open dataset of over 51k audio clips, totalling over 100 h of audio, manually labeled using 200 classes drawn from the AudioSet Ontology. It aims to provide an alternative benchmark dataset and thus foster sound event recognition (SER) research.
Audio Tagging by Cross Filtering Noisy Labels
This article presents CrossFilter, a novel framework for combating noisy labels in audio tagging; it achieves state-of-the-art performance and even surpasses ensemble models on the FSDKaggle2018 dataset.
Learning Sound Event Classifiers from Web Audio with Noisy Labels
Experiments suggest that training with large amounts of noisy data can outperform training with smaller amounts of carefully labeled data, and it is shown that noise-robust loss functions can be effective in improving performance in the presence of corrupted labels.
Audio tagging with noisy labels and minimal supervision
This paper presents the task setup, the FSDKaggle2019 dataset prepared for this scientific evaluation, and a baseline system consisting of a convolutional neural network.
Learning With Out-of-Distribution Data for Audio Classification
This work investigates a form of labelling error for classification tasks in which the dataset is corrupted with out-of-distribution (OOD) instances, and shows that detecting and relabelling certain OOD instances, rather than discarding them, can have a positive effect on learning.
Audio Set: An ontology and human-labeled dataset for audio events
This paper describes the creation of Audio Set, a large-scale dataset of manually annotated audio events that endeavors to bridge the gap in data availability between image and audio research and substantially stimulate the development of high-performance audio event recognizers.
VGGSound: A Large-Scale Audio-Visual Dataset
The goal is to collect a large-scale audio-visual dataset with low label noise from videos "in the wild" using computer vision techniques; the paper also investigates various convolutional neural network architectures and aggregation approaches to establish audio recognition baselines for this new dataset.
Training Convolutional Networks with Noisy Labels
An extra noise layer is introduced into the network to adapt the network outputs to match the noisy label distribution; the parameters of this layer can be estimated as part of the training process and involve simple modifications to current training infrastructures for deep networks.
Learning Sound Events From Webly Labeled Data
This work introduces webly labeled learning for sound events which aims to remove human supervision altogether from the learning process, and develops a method of obtaining labeled audio data from the web, in which no manual labeling is involved.
Making Deep Neural Networks Robust to Label Noise: A Loss Correction Approach
It is proved that, when ReLU is the only non-linearity, the loss curvature is immune to class-dependent label noise. It is also shown how one can estimate the probability of each class being corrupted into another, adapting a recent technique for noise estimation to the multi-class setting and providing an end-to-end framework.