A Semi-Supervised Two-Stage Approach to Learning from Noisy Labels

@inproceedings{Ding2018AST,
  title={A Semi-Supervised Two-Stage Approach to Learning from Noisy Labels},
  author={Yifan Ding and Liqiang Wang and Deliang Fan and Boqing Gong},
  booktitle={2018 IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year={2018},
  pages={1215-1224}
}
The recent success of deep neural networks is powered in part by large-scale, well-labeled training data. However, it is a daunting task to laboriously annotate an ImageNet-like dataset. In contrast, it is fairly convenient, fast, and cheap to collect training images from the Web along with their noisy labels. This signifies the need for alternative approaches to training deep neural networks with such noisy labels. Existing methods tackling this problem either try to identify and correct…

Citations

Data fusing and joint training for learning with noisy labels

This paper proposes a new method for accurately selecting training data: it fits a mixture model to the per-sample losses of the raw label and the predicted label, and uses the mixture model to dynamically divide the training set into a correctly labeled set, a correctly predicted set, and a wrong set.

Distilling Effective Supervision From Severe Label Noise

This paper presents a holistic framework for training deep neural networks that is highly robust to label noise and achieves excellent performance on large-scale datasets with real-world label noise.

Class-conditional Importance Weighting for Deep Learning with Noisy Labels

This paper extends the existing Contrast to Divide algorithm coupled with DivideMix using a new class-conditional weighting scheme, and proposes a loss correction method that relies on dynamic weights computed during model training.

DivideMix: Learning with Noisy Labels as Semi-supervised Learning

This work proposes DivideMix, a novel framework for learning with noisy labels by leveraging semi-supervised learning techniques, which models the per-sample loss distribution with a mixture model to dynamically divide the training data into a labeled set with clean samples and an unlabeled set with noisy samples.
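
As a concrete illustration of the loss-based split DivideMix describes, here is a minimal sketch (not the authors' exact implementation): fit a two-component Gaussian mixture to the normalized per-sample cross-entropy losses and use the posterior of the low-mean component as the probability that a sample is clean. The names model and loader and the threshold tau are illustrative assumptions.

import numpy as np
import torch
import torch.nn.functional as F
from sklearn.mixture import GaussianMixture

@torch.no_grad()
def split_clean_noisy(model, loader, tau=0.5, device="cpu"):
    """Divide a noisily labeled dataset into clean/noisy index sets."""
    model.eval()
    losses = []
    for x, y in loader:  # loader must iterate in a fixed order
        logits = model(x.to(device))
        losses.append(F.cross_entropy(logits, y.to(device), reduction="none").cpu())
    losses = torch.cat(losses).numpy().reshape(-1, 1)
    # Min-max normalize so the mixture fit is insensitive to loss scale.
    losses = (losses - losses.min()) / (losses.max() - losses.min() + 1e-8)
    gmm = GaussianMixture(n_components=2, reg_covar=5e-4).fit(losses)
    clean_comp = int(np.argmin(gmm.means_.ravel()))   # low-loss component
    p_clean = gmm.predict_proba(losses)[:, clean_comp]
    clean_idx = np.flatnonzero(p_clean > tau)    # kept as the labeled set
    noisy_idx = np.flatnonzero(p_clean <= tau)   # labels discarded; unlabeled set
    return clean_idx, noisy_idx, p_clean

The clean subset keeps its labels while the noisy subset is handed to a standard semi-supervised learner, which is the essence of the framework.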

Learning from Noisy Labels with Deep Neural Networks: A Survey

A comprehensive review of 62 state-of-the-art robust training methods, all of which are categorized into five groups according to their methodological difference, followed by a systematic comparison of six properties used to evaluate their superiority.

JSMix: a holistic algorithm for learning with label noise

This paper proposes a robust algorithm for learning with label noise that requires neither additional clean data nor an auxiliary model, and experimentally shows that integrating SSL helps the model divide the two subsets more precisely and build more explicit decision boundaries.

Deep Self-Learning From Noisy Labels

This work presents a novel deep self-learning framework to train a robust network on the real noisy datasets without extra supervision, which is effective and efficient and outperforms its counterparts in all empirical settings.

Meta joint optimization: a holistic framework for noisy-labeled visual recognition

This work proposes meta-joint optimization (MJO), a novel and holistic framework for learning with noisy labels that can jointly learn DNN parameters and correct noisy labels and demonstrates the advantageous performance of the proposed method compared to state-of-the-art baselines.
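
As a generic, heavily hedged sketch of the broad idea this entry describes (jointly updating network parameters and label estimates; not necessarily MJO's exact procedure), one can maintain a soft-label table and blend it toward the model's predictions between gradient steps:

import torch
import torch.nn.functional as F

def refresh_soft_labels(soft_labels, idx, logits, momentum=0.9):
    """soft_labels: (N, C) running label estimates; idx: minibatch indices."""
    probs = F.softmax(logits.detach(), dim=1).cpu()
    soft_labels[idx] = momentum * soft_labels[idx] + (1 - momentum) * probs
    return soft_labels

def soft_cross_entropy(logits, soft_targets):
    """Cross-entropy against the corrected soft labels instead of hard ones."""
    return -(soft_targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()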

...

References

Showing 1-10 of 65 references.

Attend in Groups: A Weakly-Supervised Deep Learning Framework for Learning from Web Data

This work proposes an end-to-end weakly-supervised deep learning framework which is robust to the label noise in Web images and relies on two unified strategies, random grouping and attention, to effectively reduce the negative impact of noisy web image annotations.

Training Deep Neural Networks on Noisy Labels with Bootstrapping

A generic way to handle noisy and incomplete labeling is proposed, augmenting the prediction objective with a notion of consistency: a prediction is considered consistent if the same prediction is made given similar percepts, where similarity is measured between deep network features computed from the input data.
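
The paper's "soft bootstrapping" objective admits a compact expression: the training target is a convex combination of the observed (possibly noisy) label and the network's own prediction, so self-consistent predictions are partly trusted over the given labels. A minimal sketch, with beta the weight placed on the observed label:

import torch
import torch.nn.functional as F

def soft_bootstrap_loss(logits, noisy_targets, beta=0.95):
    """logits: (N, C); noisy_targets: (N,) integer class indices."""
    log_p = F.log_softmax(logits, dim=1)
    q = F.one_hot(noisy_targets, num_classes=logits.size(1)).float()
    # Target mixes the given label with the model's (detached) prediction.
    target = beta * q + (1.0 - beta) * log_p.exp().detach()
    return -(target * log_p).sum(dim=1).mean()

The "hard" variant replaces the soft prediction with a one-hot vector at its argmax.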

Learning from Noisy Labels with Distillation

This work proposes a unified distillation framework to use “side” information, including a small clean dataset and label relations in knowledge graph, to “hedge the risk” of learning from noisy labels, and proposes a suite of new benchmark datasets to evaluate this task in Sports, Species and Artifacts domains.
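
One way to read the "hedge the risk" combination is as a pseudo-label that mixes the noisy label with a teacher's soft prediction. The sketch below assumes a teacher already trained on the small clean set and an illustrative mixing weight lam; the paper's use of knowledge-graph label relations is omitted.

import torch
import torch.nn.functional as F

def distilled_loss(student_logits, noisy_targets, teacher_probs, lam=0.7):
    """noisy_targets: (N,) ints; teacher_probs: (N, C) teacher softmax output."""
    q = F.one_hot(noisy_targets, num_classes=student_logits.size(1)).float()
    # Pseudo-label hedges between the noisy label and the teacher's belief.
    target = lam * q + (1.0 - lam) * teacher_probs
    log_p = F.log_softmax(student_logits, dim=1)
    return -(target * log_p).sum(dim=1).mean()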

Deep Learning is Robust to Massive Label Noise

It is shown that deep neural networks are capable of generalizing from training data for which true labels are massively outnumbered by incorrect labels, and that training in this regime requires a significant but manageable increase in dataset size that is related to the factor by which correct labels have been diluted.

Training Convolutional Networks with Noisy Labels

An extra noise layer is introduced into the network that adapts the network outputs to match the noisy label distribution; the layer can be estimated as part of the training process and requires only simple modifications to current training infrastructures for deep networks.
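
A minimal sketch of such a noise layer, under the common formulation where a learnable row-stochastic matrix Q, with Q[i, j] = P(observed label j | true label i), sits on top of the base network's softmax; initializing Q near the identity keeps early training close to plain cross-entropy (the initialization weight here is an illustrative choice):

import torch
import torch.nn as nn
import torch.nn.functional as F

class NoiseAdaptationLayer(nn.Module):
    def __init__(self, num_classes, diag_init=0.9):
        super().__init__()
        # Parameterize Q via a softmax so its rows stay stochastic.
        off = (1.0 - diag_init) / (num_classes - 1)
        init = torch.full((num_classes, num_classes), off)
        init.fill_diagonal_(diag_init)
        self.q_logits = nn.Parameter(init.log())

    def forward(self, clean_probs):
        Q = F.softmax(self.q_logits, dim=1)  # (C, C), rows sum to 1
        return clean_probs @ Q               # predicted noisy-label distribution

# Usage sketch: train with NLL on the noisy labels through the layer.
# p_noisy = noise_layer(F.softmax(base_logits, dim=1))
# loss = F.nll_loss(torch.log(p_noisy + 1e-8), noisy_y)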

Learning from massive noisy labeled data for image classification

A general framework is introduced to train CNNs with only a limited number of clean labels and millions of easily obtained noisy labels: the relationships between images, class labels, and label noise are modeled with a probabilistic graphical model, which is further integrated into an end-to-end deep learning system.

Toward Robustness against Label Noise in Training Deep Discriminative Neural Networks

The proposed novel framework for training deep convolutional neural networks from cheaply obtained, noisily labeled datasets is applied to the image labeling problem and is shown to be effective both in labeling unseen images and in reducing label noise during training on the CIFAR-10 and MS COCO datasets.

Temporal Ensembling for Semi-Supervised Learning

Self-ensembling is introduced, in which the network's predictions are aggregated over multiple training epochs; this ensemble prediction is shown to be a better predictor of the unknown labels than the network's output at the most recent training epoch, and can thus be used as a target for training.
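
A minimal sketch of the temporal-ensembling recipe: keep an exponential moving average Z of each sample's predictions across epochs, debias it, and penalize disagreement between the current prediction and that ensemble target (alpha and the unsupervised weight w follow the paper's general recipe; the buffer handling is illustrative):

import torch
import torch.nn.functional as F

class TemporalEnsemble:
    def __init__(self, num_samples, num_classes, alpha=0.6):
        self.alpha = alpha
        self.Z = torch.zeros(num_samples, num_classes)  # EMA of predictions
        self.epoch = 0  # increment once per full pass over the data

    def targets(self, idx):
        # Bias-corrected ensemble predictions for a minibatch of indices.
        return self.Z[idx] / (1.0 - self.alpha ** max(self.epoch, 1))

    def update(self, idx, probs):
        self.Z[idx] = self.alpha * self.Z[idx] + (1 - self.alpha) * probs.detach().cpu()

def consistency_loss(logits, ensemble_targets, w=1.0):
    # Unsupervised term: the current prediction should match the ensemble.
    return w * F.mse_loss(F.softmax(logits, dim=1), ensemble_targets)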

Making Deep Neural Networks Robust to Label Noise: A Loss Correction Approach

It is proved that, when ReLU is the only non-linearity, the loss curvature is immune to class-dependent label noise; the paper further shows how to estimate the label-noise transition probabilities by adapting a recent noise-estimation technique to the multi-class setting, yielding an end-to-end framework.
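
The paper's "forward" correction has a particularly small footprint: push the clean-class posterior through the estimated transition matrix T, with T[i, j] = P(observed = j | true = i), before taking cross-entropy on the noisy labels. A minimal sketch, assuming T has already been estimated (e.g., from highly confident "anchor" examples, as in the paper's estimation procedure):

import torch
import torch.nn.functional as F

def forward_corrected_loss(logits, noisy_targets, T):
    """logits: (N, C); noisy_targets: (N,); T: (C, C) row-stochastic."""
    p_clean = F.softmax(logits, dim=1)
    p_noisy = p_clean @ T  # model's predicted distribution over observed labels
    return F.nll_loss(torch.log(p_noisy.clamp_min(1e-12)), noisy_targets)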

Learning from Noisy Large-Scale Datasets with Minimal Supervision

An approach to effectively use millions of images with noisy annotations, in conjunction with a small subset of cleanly annotated images, to learn powerful image representations; it is particularly effective for a large number of classes with a wide range of annotation noise.
...