Making Deep Neural Networks Robust to Label Noise: A Loss Correction Approach

  • Giorgio Patrini, Alessandro Rozza, Aditya Krishna Menon, Richard Nock, Lizhen Qu
  • Published 13 September 2016
  • Computer Science
  • 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
We present a theoretically grounded approach to train deep neural networks, including recurrent networks, subject to class-dependent label noise. We propose two procedures for loss correction that are agnostic to both application domain and network architecture. They simply amount to at most a matrix inversion and multiplication, provided that we know the probability of each class being corrupted into another. We further show how one can estimate these probabilities, adapting a recent technique… 
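The abstract above is truncated and does not spell out the two procedures. As a minimal sketch, assume they are a "backward" correction (multiply the vector of per-class losses by the inverse of the noise matrix) and a "forward" correction (multiply the predicted class probabilities by the noise matrix before computing the loss). The names `T`, `forward_corrected_loss` and `backward_corrected_loss` are illustrative, with `T[i, j]` taken to be the probability that clean class `i` is observed as noisy class `j`; cross-entropy is assumed as the base loss.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward_corrected_loss(logits, noisy_labels, T):
    # Map clean-class probabilities to noisy-class probabilities:
    # p(noisy = j) = sum_i p(clean = i) * T[i, j]
    p_noisy = softmax(logits) @ T
    rows = np.arange(len(noisy_labels))
    return -np.log(p_noisy[rows, noisy_labels] + 1e-12).mean()

def backward_corrected_loss(logits, noisy_labels, T):
    # Cross-entropy loss for every possible label, shape (n, c).
    losses = -np.log(softmax(logits) + 1e-12)
    # Corrected loss vector per example is T^{-1} @ losses[n]
    # (requires T to be invertible).
    corrected = losses @ np.linalg.inv(T).T
    rows = np.arange(len(noisy_labels))
    return corrected[rows, noisy_labels].mean()
```

With `T` equal to the identity, both functions reduce to ordinary cross-entropy. Under this sketch, the backward-corrected loss averaged over the noisy labels drawn from row `T[y]` equals the uncorrected loss at the clean label `y`, which is the sense in which the correction undoes the noise.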


Training Noise-Robust Deep Neural Networks via Meta-Learning
This work proposes a new loss correction approach, named Meta Loss Correction (MLC), that learns T directly from data via a meta-learning framework. The approach is model-agnostic and learns T from data rather than heuristically approximating it using prior knowledge.
Making Deep Neural Networks Robust to Label Noise: A Reweighting Loss and Data Filtration
A robust loss function by reweighting the standard Cross-Entropy loss is proposed for obtaining more robust DNNs under label noise and a framework to jointly optimize model parameters and filtering noisy data during training is designed.
Learning from Noisy Labels with Complementary Loss Functions
Experimental results on benchmark classification datasets indicate that the proposed method helps achieve robust and sufficient deep neural network training simultaneously, and can be regarded as the supplement.
Deep Neural Networks for Corrupted Labels
An approach for learning deep networks from datasets corrupted by unknown label noise is described. It appends a nonlinear noise model to a standard deep network, and the noise model is learned in tandem with the parameters of the network.
Unsupervised label noise modeling and loss correction
A suitable two-component mixture model is suggested as an unsupervised generative model of sample loss values during training to allow online estimation of the probability that a sample is mislabelled and correct the loss by relying on the network prediction.
Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels
A theoretically grounded set of noise-robust loss functions that can be seen as a generalization of MAE and CCE are presented and can be readily applied with any existing DNN architecture and algorithm, while yielding good performance in a wide range of noisy label scenarios.
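The summary above does not give the loss itself. As a rough sketch, assume the family is the L_q loss, (1 - p_y^q) / q, where p_y is the predicted probability of the labelled class and q lies in (0, 1]. The function name and the default `q=0.7` are illustrative choices, not taken from this page.

```python
import numpy as np

def generalized_cross_entropy(probs, labels, q=0.7):
    # probs: (n, c) predicted class probabilities; labels: (n,) integer labels.
    # q in (0, 1]; the default here is an illustrative assumption.
    p_y = probs[np.arange(len(labels)), labels]
    return ((1.0 - p_y ** q) / q).mean()
```

At `q = 1` this is `1 - p_y`, which is proportional to MAE against a one-hot target; as `q` approaches 0 it approaches `-log(p_y)`, the categorical cross-entropy. Intermediate values trade off between the two.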
A Spectral Perspective of DNN Robustness to Label Noise
This work relates the smoothness regularization that usually exists in conventional training to the attenuation of high frequencies, which mainly characterize noise, and suggests that one may further improve robustness via spectral normalization.
Mitigating Memorization of Noisy Labels via Regularization between Representations
This paper decouples DNNs into an encoder followed by a linear classifier and proposes to restrict the function space of a DNN with a representation regularizer, which requires the distance between two self-supervised features to be positively related to the distance between the corresponding two supervised model outputs.
Pumpout: A Meta Approach for Robustly Training Deep Neural Networks with Noisy Labels
A meta algorithm called Pumpout is proposed to overcome the problem of memorizing noisy labels by using scaled stochastic gradient ascent, which actively squeezes out the negative effects of noisy labels from the training model, instead of passively forgetting these effects.
Synergistic Network Learning and Label Correction for Noise-robust Image Classification
A robust label correction framework combining the ideas of small loss selection and noise correction, which learns network parameters and reassigns ground truth labels iteratively, is proposed.


Training Deep Neural Networks on Noisy Labels with Bootstrapping
A generic way to handle noisy and incomplete labeling by augmenting the prediction objective with a notion of consistency is proposed, which considers a prediction consistent if the same prediction is made given similar percepts, where the notion of similarity is between deep network features computed from the input data.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Training Convolutional Networks with Noisy Labels
An extra noise layer is introduced into the network that adapts the network outputs to match the noisy label distribution. Its parameters can be estimated as part of the training process, and the approach involves only simple modifications to current training infrastructures for deep networks.
Deep Residual Learning for Image Recognition
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Learning from massive noisy labeled data for image classification
A general framework is introduced to train CNNs with only a limited number of clean labels and millions of easily obtained noisy labels. The relationships between images, class labels and label noise are modeled with a probabilistic graphical model, which is further integrated into an end-to-end deep learning system.
Deep Networks with Stochastic Depth
Stochastic depth is proposed, a training procedure that enables the seemingly contradictory setup to train short networks and use deep networks at test time and reduces training time substantially and improves the test error significantly on almost all data sets that were used for evaluation.
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
Identity Mappings in Deep Residual Networks
The propagation formulations behind the residual building blocks suggest that the forward and backward signals can be directly propagated from one block to any other block, when using identity mappings as the skip connections and after-addition activation.
Dropout: a simple way to prevent neural networks from overfitting
It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
Loss factorization, weakly supervised learning and label noise robustness
We prove that the empirical risk of most well-known loss functions factors into a linear term aggregating all labels with a term that is label free, and can further be expressed by sums of the loss.