Corpus ID: 3653594

Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise

@inproceedings{Hendrycks2018UsingTD,
  title={Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise},
  author={Dan Hendrycks and Mantas Mazeika and Duncan Wilson and Kevin Gimpel},
  booktitle={NeurIPS},
  year={2018}
}
The growing importance of massive datasets with the advent of deep learning makes robustness to label noise a critical property for classifiers to have. Sources of label noise include automatic labeling for large datasets, non-expert labeling, and label corruption by data poisoning adversaries. In the latter case, corruptions may be arbitrarily bad, even so bad that a classifier predicts the wrong labels with high confidence. To protect against such sources of noise, we leverage the fact that a… 
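The approach outlined in the abstract leverages a small trusted set to estimate how labels were corrupted, then folds that estimate into training on the noisy data. A minimal NumPy sketch of that idea (function names and the fallback for unseen classes are mine, not the authors'):

```python
import numpy as np

def estimate_corruption_matrix(noisy_model_probs, trusted_labels, num_classes):
    """Estimate C[i, j] ~= p(noisy label = j | true label = i).

    noisy_model_probs: softmax outputs, on the trusted set, of a model
    trained on the noisy labels; trusted_labels: the known-clean labels.
    """
    C = np.zeros((num_classes, num_classes))
    for i in range(num_classes):
        rows = noisy_model_probs[trusted_labels == i]
        # Fall back to "no corruption" for classes absent from the trusted set.
        C[i] = rows.mean(axis=0) if len(rows) else np.eye(num_classes)[i]
    return C

def corrected_probs(clean_probs, C):
    # Forward correction: p(noisy = j) = sum_i p(true = i) * C[i, j], so
    # cross-entropy against noisy labels still trains the clean-label model.
    return clean_probs @ C
```

With an identity corruption matrix the correction is a no-op, and each estimated row is a proper distribution over noisy labels.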


IEG: Robust Neural Network Training to Tackle Severe Label Noise
TLDR
A method to train neural networks in a way that is almost invulnerable to severe label noise by utilizing a tiny trusted set of labels, based on three key insights: Isolation of noisy labels, Escalation of useful supervision from mislabeled data, and Guidance from the small trusted set.
Learning to Bootstrap for Combating Label Noise
TLDR
This paper proposes a more generic learnable loss objective which enables a joint reweighting of instances and labels at once, and dynamically adjusts the per-sample importance weight between the real observed labels and pseudo-labels, where the weights are efficiently determined in a meta process.
FINE Samples for Learning with Noisy Labels
TLDR
A novel detector for filtering label noise that focuses on each data point's latent representation dynamics and measures the alignment between the latent distribution and each representation using the eigendecomposition of the data Gram matrix, providing a robust detector based on simple, derivative-free methods with theoretical guarantees.
ExpertNet: Adversarial Learning and Recovery Against Noisy Labels
TLDR
This paper proposes a novel framework, ExpertNet, composed of Amateur and Expert modules that iteratively learn from each other; it achieves robust classification across a wide range of noise ratios with as little as 20–50% of the training data, compared to state-of-the-art deep models that solely focus on distilling the impact of noisy labels.
Label Noise Types and Their Effects on Deep Learning
TLDR
A detailed analysis of the effects of different kinds of label noise on learning is provided, and a generic framework to generate feature-dependent label noise is proposed, which is shown to be the most challenging case for learning.
Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels
TLDR
This paper establishes the first benchmark of controlled real-world label noise from the web, and conducts the largest study by far into understanding deep neural networks trained on noisy labels across different noise levels, noise types, network architectures, and training settings.
Training Classifiers that are Universally Robust to All Label Noise Levels
TLDR
A distillation-based framework that incorporates a new subcategory of Positive-Unlabeled (PU) learning, which assumes that a small subset of any given noisy dataset is known to have correct labels, while the remaining noisy subset is treated as "unlabeled".
Robust Temporal Ensembling for Learning with Noisy Labels
TLDR
Robust temporal ensembling (RTE) is presented, which combines robust loss with semi-supervised regularization methods to achieve noise-robust learning and achieves state-of-the-art performance across the CIFAR-10, CIFAR-100, ImageNet, WebVision, and Food-101N datasets.
Mitigating Memorization in Sample Selection for Learning with Noisy Labels
TLDR
This study proposes a criterion for intensively penalizing dominant noisy-labeled samples through class-wise penalty labels, which take high values when the labels are heavily corrupted toward certain classes.

References

SHOWING 1-10 OF 35 REFERENCES
Training Deep Neural Networks on Noisy Labels with Bootstrapping
TLDR
A generic way to handle noisy and incomplete labeling by augmenting the prediction objective with a notion of consistency is proposed, which considers a prediction consistent if the same prediction is made given similar percepts, where the notion of similarity is between deep network features computed from the input data.
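The consistency objective described above amounts to blending the observed label with the network's own prediction. A hedged sketch of the "soft bootstrapping" target (my simplification of the idea, not the authors' code; `beta` weights the observed label):

```python
import numpy as np

def soft_bootstrap_targets(one_hot_labels, predicted_probs, beta=0.95):
    # Soft bootstrapping: the training target is a convex combination of
    # the (possibly noisy) observed label and the model's current belief,
    #   t = beta * y + (1 - beta) * p,
    # so confident self-predictions can gradually override implausible labels.
    return beta * one_hot_labels + (1.0 - beta) * predicted_probs
```

When `beta = 1` this reduces to standard training on the given labels, and the blended targets remain valid probability distributions.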
Learning from Noisy Labels with Distillation
TLDR
This work proposes a unified distillation framework to use “side” information, including a small clean dataset and label relations in knowledge graph, to “hedge the risk” of learning from noisy labels, and proposes a suite of new benchmark datasets to evaluate this task in Sports, Species and Artifacts domains.
Making Deep Neural Networks Robust to Label Noise: A Loss Correction Approach
TLDR
It is proved that, when ReLU is the only non-linearity, the loss curvature is immune to class-dependent label noise, and it is shown how one can estimate the label-transition probabilities, adapting a recent technique for noise estimation to the multi-class setting and providing an end-to-end framework.
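The loss-correction idea can be illustrated with the "backward" variant: premultiply the vector of per-class losses by the inverse of the noise transition matrix, here assumed known. A sketch under that assumption (naming is mine):

```python
import numpy as np

def backward_corrected_loss(probs, noisy_label, C):
    # Per-class cross-entropy losses l_i = -log p(y = i | x); the
    # backward-corrected loss is (C^{-1} l)[noisy_label], which in
    # expectation over the noise matches the loss on clean labels.
    per_class_loss = -np.log(probs)
    return (np.linalg.inv(C) @ per_class_loss)[noisy_label]
```

With no noise (identity `C`) this is exactly the ordinary cross-entropy on the observed label.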
Learning from Noisy Large-Scale Datasets with Minimal Supervision
TLDR
An approach to effectively use millions of images with noisy annotations in conjunction with a small subset of cleanly-annotated images to learn powerful image representations and is particularly effective for a large number of classes with wide range of noise in annotations.
Training Convolutional Networks with Noisy Labels
TLDR
An extra noise layer is introduced into the network that adapts the network outputs to match the noisy label distribution; its parameters can be estimated as part of the training process, requiring only simple modifications to current training infrastructures for deep networks.
Learning with Noisy Labels
TLDR
The problem of binary classification in the presence of random classification noise is theoretically studied: the learner sees labels that have independently been flipped with some small probability, and methods used in practice such as biased SVM and weighted logistic regression are shown to be provably noise-tolerant.
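For the binary setting described here, the unbiased-estimator construction can be sketched as follows, with the flip rates `rho_pos`, `rho_neg` assumed known (helper names are mine):

```python
def unbiased_loss(loss, score, y, rho_pos, rho_neg):
    # Unbiased estimator for labels y in {+1, -1}, each flipped
    # independently with rate rho_pos = p(flip | y = +1) or
    # rho_neg = p(flip | y = -1):
    #   l~(t, y) = ((1 - rho_{-y}) * l(t, y) - rho_y * l(t, -y))
    #              / (1 - rho_pos - rho_neg)
    # Its expectation over the label noise equals l(t, y) on clean labels.
    rho_y = rho_pos if y == 1 else rho_neg
    rho_not_y = rho_neg if y == 1 else rho_pos
    return ((1 - rho_not_y) * loss(score, y) - rho_y * loss(score, -y)) / (
        1 - rho_pos - rho_neg)
```

With zero flip rates it reduces to the base loss, and averaging it over the noise distribution recovers the clean-label loss exactly.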
Learning from massive noisy labeled data for image classification
TLDR
A general framework to train CNNs with only a limited number of clean labels and millions of easily obtained noisy labels is introduced; the relationships between images, class labels, and label noise are modeled with a probabilistic graphical model, which is further integrated into an end-to-end deep learning system.
Learning from Binary Labels with Instance-Dependent Corruption
TLDR
It is proved that for instance-dependent noise, any algorithm that is consistent for classification on the noisy distribution is also consistent on the clean distribution, and that for a broad class of instance- and label-dependent noise, a similar consistency result holds for the area under the ROC curve.
Class Noise vs. Attribute Noise: A Quantitative Study
TLDR
A systematic evaluation on the effect of noise in machine learning separates noise into two categories: class noise and attribute noise, and investigates the relationship between attribute noise and classification accuracy, the impact of noise at different attributes, and possible solutions in handling attribute noise.
Certified Defenses for Data Poisoning Attacks
TLDR
This work addresses the worst-case loss of a defense in the face of a determined attacker by constructing approximate upper bounds on the loss across a broad family of attacks, for defenders that first perform outlier removal followed by empirical risk minimization.