CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise

Kuang-Huei Lee, Xiaodong He, Lei Zhang, Linjun Yang
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
In this paper, we study the problem of learning image classification models with label noise. Existing approaches depending on human supervision are generally not scalable as manually identifying correct or incorrect labels is time-consuming, whereas approaches not relying on human supervision are scalable but less effective. To reduce the amount of human supervision for label noise cleaning, we introduce CleanNet, a joint neural embedding network, which only requires a fraction of the classes… 
Towards Scalable Image Classifier Learning with Noisy Labels via Domain Adaptation
This chapter introduces a transfer learning set-up for tackling noisy labels, reviews CleanNet, the first neural network model that practically implements this set-up, and explores future directions of this topic.
Weakly Supervised Image Classification Through Noise Regularization
Experimental results show that the proposed approach outperforms the state-of-the-art methods, and generalizes well to both single-label and multi-label scenarios.
Distilling Effective Supervision From Severe Label Noise
This paper presents a holistic framework to train deep neural networks in a way that is highly invulnerable to label noise and achieves excellent performance on large-scale datasets with real-world label noise.
Learning to Learn From Noisy Labeled Data
This work proposes a noise-tolerant training algorithm, where a meta-learning update is performed prior to conventional gradient update, and trains the model such that after one gradient update using each set of synthetic noisy labels, the model does not overfit to the specific noise.
Training-ValueNet: Data Driven Label Noise Cleaning on Weakly-Supervised Web Images
L. Smyth, D. Kangin, N. Pugeault. 2019 Joint IEEE 9th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob), 2019.
It is demonstrated that by simply discarding images with a negative training-value, Training-ValueNet is able to significantly improve classification performance on a held-out test set, outperforming the state of the art in outlier detection by a large margin.
Weakly Supervised Learning with Side Information for Noisy Labeled Images
An efficient weakly supervised learning approach using a Side Information Network (SINet), which aims to carry out large-scale classification with severely noisy labels, won first place in the classification task of the WebVision Challenge 2019 and outperformed other competitors by a large margin.
Image Classification with Deep Learning in the Presence of Noisy Labels: A Survey
PENCIL: Deep Learning with Noisy Labels
PENCIL outperforms previous state-of-the-art methods by large margins on both synthetic and real-world datasets with different noise types and noise rates and is also effective in multi-label classification tasks through adding a simple attention structure on backbone networks.
Label Distribution for Learning with Noisy Labels
A novel method named Label Distribution based Confidence Estimation (LDCE) is proposed, which estimates the confidence of the observed labels based on label distribution; the resulting confidence scores make the boundary between clean labels and noisy labels clear.
Robust Temporal Ensembling for Learning with Noisy Labels
Robust temporal ensembling (RTE) is presented, which combines robust loss with semi-supervised regularization methods to achieve noise-robust learning and achieves state-of-the-art performance across the CIFAR-10, CIFAR-100, ImageNet, WebVision, and Food-101N datasets.


Deep Learning is Robust to Massive Label Noise
It is shown that deep neural networks are capable of generalizing from training data for which true labels are massively outnumbered by incorrect labels, and that training in this regime requires a significant but manageable increase in dataset size that is related to the factor by which correct labels have been diluted.
Learning from massive noisy labeled data for image classification
A general framework to train CNNs with only a limited number of clean labels and millions of easily obtained noisy labels is introduced. The relationships between images, class labels, and label noise are modeled with a probabilistic graphical model, which is further integrated into an end-to-end deep learning system.
Training Convolutional Networks with Noisy Labels
An extra noise layer is introduced into the network, which adapts the network outputs to match the noisy label distribution. The parameters of this layer can be estimated as part of the training process and involve only simple modifications to current training infrastructures for deep networks.
Attend in Groups: A Weakly-Supervised Deep Learning Framework for Learning from Web Data
This work proposes an end-to-end weakly-supervised deep learning framework which is robust to the label noise in Web images and relies on two unified strategies, random grouping and attention, to effectively reduce the negative impact of noisy web image annotations.
Auxiliary Image Regularization for Deep CNNs with Noisy Labels
An auxiliary image regularization technique is proposed, optimized by the stochastic Alternating Direction Method of Multipliers (ADMM) algorithm, that automatically exploits the mutual context information among training images and encourages the model to select reliable images to robustify the learning process.
Zero-Shot Learning Through Cross-Modal Transfer
This work introduces a model that can recognize objects in images even if no training data is available for the object class, and uses novelty detection methods to differentiate unseen classes from seen classes.
Learning from Noisy Large-Scale Datasets with Minimal Supervision
An approach is presented that effectively uses millions of images with noisy annotations, in conjunction with a small subset of cleanly annotated images, to learn powerful image representations. It is particularly effective for a large number of classes with a wide range of noise in annotations.
Making Deep Neural Networks Robust to Label Noise: A Loss Correction Approach
It is proved that, when ReLU is the only non-linearity, the loss curvature is immune to class-dependent label noise. It is also shown how one can estimate the probabilities of each class being corrupted into another, adapting a recent technique for noise estimation to the multi-class setting and providing an end-to-end framework.
Learning from Noisy Labels with Distillation
This work proposes a unified distillation framework to use “side” information, including a small clean dataset and label relations in knowledge graph, to “hedge the risk” of learning from noisy labels, and proposes a suite of new benchmark datasets to evaluate this task in Sports, Species and Artifacts domains.
LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop
This work proposes to amplify human effort through a partially automated labeling scheme, leveraging deep learning with humans in the loop, and constructs a new image dataset, LSUN, which contains around one million labeled images for each of 10 scene categories and 20 object categories.