Corpus ID: 2181703

Training Deep Neural Networks on Noisy Labels with Bootstrapping

@article{Reed2015TrainingDN,
  title={Training Deep Neural Networks on Noisy Labels with Bootstrapping},
  author={Scott E. Reed and Honglak Lee and Dragomir Anguelov and Christian Szegedy and D. Erhan and Andrew Rabinovich},
  journal={CoRR},
  year={2015},
  volume={abs/1412.6596}
}
Current state-of-the-art deep learning systems for visual object recognition and detection use purely supervised training with regularization such as dropout to avoid overfitting. […] We consider a prediction consistent if the same prediction is made given similar percepts, where the notion of similarity is between deep network features computed from the input data. In experiments we demonstrate that our approach yields substantial robustness to label noise on several datasets. On MNIST handwritten…
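The bootstrapping idea sketched in the abstract amounts to a modified cross-entropy in which the training target is a convex combination of the (possibly noisy) label and the network's own current prediction. Below is a minimal PyTorch-style sketch of the soft variant; β = 0.95 follows the value the paper reports for soft bootstrapping, but the function name and surrounding setup are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def soft_bootstrapping_loss(logits, noisy_targets, beta=0.95):
    """Cross-entropy against a mix of the (possibly noisy) labels and the
    network's own current predictions.

    logits:        [batch, num_classes] raw network outputs
    noisy_targets: [batch] integer class labels (may contain label noise)
    """
    log_probs = F.log_softmax(logits, dim=1)
    probs = log_probs.exp().detach()                              # q_k: current predictions
    one_hot = F.one_hot(noisy_targets, logits.size(1)).float()    # t_k: given (noisy) labels
    # Bootstrapped target: beta * t_k + (1 - beta) * q_k
    target = beta * one_hot + (1.0 - beta) * probs
    return -(target * log_probs).sum(dim=1).mean()
```

The hard variant replaces the predicted distribution with a one-hot vector at its argmax (the paper reports β = 0.8 in that case).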

Citations

Deep Neural Networks for Corrupted Labels
TLDR
An approach for learning deep networks from datasets corrupted by unknown label noise is described: a nonlinear noise model is appended to a standard deep network and learned in tandem with the network's parameters.
Deep Learning is Robust to Massive Label Noise
TLDR
It is shown that deep neural networks are capable of generalizing from training data for which true labels are massively outnumbered by incorrect labels, and that training in this regime requires a significant but manageable increase in dataset size that is related to the factor by which correct labels have been diluted.
Learning from Noisy Labels with Noise Modeling Network
TLDR
The state of the art in training classifiers is extended by modeling noisy and missing labels in multi-label images with a new Noise Modeling Network (NMN) that follows the authors' convolutional neural network (CNN) and integrates with it, forming an end-to-end deep learning system which can jointly learn the noise distribution and the CNN parameters.
Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels
TLDR
This paper finds that the test accuracy can be quantitatively characterized in terms of the noise ratio in datasets, and adopts the Co-teaching strategy which takes full advantage of the identified samples to train DNNs robustly against noisy labels.
Learning to Learn From Noisy Labeled Data
TLDR
This work proposes a noise-tolerant training algorithm, where a meta-learning update is performed prior to conventional gradient update, and trains the model such that after one gradient update using each set of synthetic noisy labels, the model does not overfit to the specific noise.
Iterative Learning with Open-set Noisy Labels
TLDR
A novel iterative learning framework for training CNNs on datasets with open-set noisy labels that detects noisy labels and learns deep discriminative features in an iterative fashion and designs a Siamese network to encourage clean labels and noisy labels to be dissimilar.
Learning with noisy labels
TLDR
There is a lack of statistical understanding of models for dealing with noisy data; attempts so far have been mostly heuristic, and many ways of modeling the noise remain to be explored.
A Spectral Perspective of DNN Robustness to Label Noise
TLDR
This work relates the smoothness regularization that usually exists in conventional training to the attenuation of high frequencies, which mainly characterize noise, and suggests that one may further improve robustness via spectral normalization.
Towards Robust Learning with Different Label Noise Distributions
TLDR
Experiments on CIFAR-10/100, ImageNet32/64 and WebVision (real-world noise) demonstrate that the proposed label noise Distribution Robust Pseudo-Labeling (DRPL) approach gives substantial improvements over the recent state of the art.
Training Classifiers that are Universally Robust to All Label Noise Levels
TLDR
A distillation-based framework that incorporates a new subcategory of Positive-Unlabeled learning, which assumes that a small subset of any given noisy dataset is known to have correct labels, while the remaining noisy subset is treated as “unlabeled”.
...
...

References

Showing 1–10 of 47 references
Learning from Noisy Labels with Deep Neural Networks
TLDR
A novel way of modifying deep learning models so they can be effectively trained on data with a high level of label noise is proposed, and it is shown that random images without labels can improve the classification performance.
Training Convolutional Networks with Noisy Labels
TLDR
An extra noise layer is introduced into the network which adapts the network outputs to match the noisy label distribution; its parameters can be estimated as part of the training process, and the approach involves only simple modifications to current training infrastructures for deep networks.
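For intuition, such a noise layer can be sketched as a learned, row-stochastic transition matrix applied to the base network's class posterior. The class and variable names below are illustrative assumptions, not this reference's released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoiseAdaptationLayer(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        # Initialize near the identity so the layer starts out as "no noise".
        self.transition_logits = nn.Parameter(torch.eye(num_classes) * 5.0)

    def forward(self, clean_log_probs):
        # Row-stochastic transition matrix: Q[i, j] = P(noisy = j | true = i)
        Q = F.softmax(self.transition_logits, dim=1)
        clean_probs = clean_log_probs.exp()
        noisy_probs = clean_probs @ Q          # mix the posterior through Q
        return torch.log(noisy_probs + 1e-12)  # log-probs over noisy labels
```

Training would apply cross-entropy to the adapted outputs against the noisy labels, while the unadapted outputs are used at test time.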
Scalable Object Detection Using Deep Neural Networks
TLDR
This work proposes a saliency-inspired neural network model for detection, which predicts a set of class-agnostic bounding boxes along with a single score for each box, corresponding to its likelihood of containing any object of interest.
Semi-Supervised Self-Training of Object Detection Models
TLDR
The key contributions of this empirical study are to demonstrate that a model trained in this manner can achieve results comparable to a model trained in the traditional manner using a much larger set of fully labeled data, and that a training data selection metric that is defined independently of the detector greatly outperforms a selection metric based on the detection confidence generated by the detector.
Deep Generative Stochastic Networks Trainable by Backprop
TLDR
Theorems that generalize recent work on the probabilistic interpretation of denoising autoencoders are provided, and an interesting justification for dependency networks and generalized pseudolikelihood is obtained along the way.
ImageNet classification with deep convolutional neural networks
TLDR
A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
Entropy Regularization
TLDR
The use of entropy regularization is motivated as a means to benefit from unlabeled data in the framework of maximum a posteriori estimation and is able to challenge mixture models and manifold learning in a number of situations.
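As a rough illustration, entropy regularization adds a penalty on the entropy of the model's predictions for unlabeled examples to the usual supervised loss, pushing decision boundaries away from dense regions of unlabeled data. The weighting term `lam` below is an assumed placeholder, not a value from the reference.

```python
import torch
import torch.nn.functional as F

def entropy_regularized_loss(labeled_logits, labels, unlabeled_logits, lam=0.1):
    # Standard supervised term on labeled examples.
    sup = F.cross_entropy(labeled_logits, labels)
    # Penalize high-entropy (uncertain) predictions on unlabeled examples.
    p = F.softmax(unlabeled_logits, dim=1)
    ent = -(p * torch.log(p + 1e-12)).sum(dim=1).mean()
    return sup + lam * ent
```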
The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization
TLDR
This work investigates the reasons for the success of sparse coding over VQ by decoupling the training and encoding phases, allowing their contributions to be separated in a controlled way, and shows not only that fast VQ algorithms can be used for training, but that randomly chosen exemplars from the training set work just as well.
Pseudo-Label : The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks
TLDR
A simple and efficient method of semi-supervised learning for deep neural networks is proposed: the network is trained in a supervised fashion with labeled and unlabeled data simultaneously, which favors a low-density separation between classes.
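A minimal sketch of the pseudo-label objective, assuming a PyTorch-style model; the fixed weight `alpha` stands in for the ramp-up schedule typically used in practice and is an assumption.

```python
import torch
import torch.nn.functional as F

def pseudo_label_loss(model, x_labeled, y_labeled, x_unlabeled, alpha=1.0):
    # Supervised term on labeled data.
    sup = F.cross_entropy(model(x_labeled), y_labeled)
    # Treat the current argmax predictions on unlabeled data as if they were labels.
    with torch.no_grad():
        pseudo = model(x_unlabeled).argmax(dim=1)
    unsup = F.cross_entropy(model(x_unlabeled), pseudo)
    return sup + alpha * unsup
```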
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
TLDR
This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.
...
...