Faster Meta Update Strategy for Noise-Robust Deep Learning

  • Youjiang Xu, Linchao Zhu, Lu Jiang, Yi Yang
  • Published 30 April 2021
  • Computer Science
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
It has been shown that deep neural networks are prone to overfitting on biased training data. To address this issue, meta-learning employs a meta model to correct the training bias. Despite promising performance, extremely slow training is currently the bottleneck of meta-learning approaches. In this paper, we introduce a novel Faster Meta Update Strategy (FaMUS) to replace the most expensive step in the meta gradient computation with a faster layer-wise approximation. We… 
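
To give a rough sense of what a meta gradient for bias correction can look like, here is a minimal sketch of gradient-alignment example reweighting. This is not FaMUS and not the authors' code. It assumes the common one-step formulation in which an example's weight is the clipped inner product between its own training gradient and the mean gradient on a small clean set, and it uses a toy one-parameter linear model. All function names and data are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def example_grad(w, x, y):
    """Gradient of the squared error 0.5 * (w.x - y)^2 with respect to w."""
    err = dot(w, x) - y
    return [err * xi for xi in x]


def mean_grad(w, batch):
    """Average per-example gradient over a batch of (x, y) pairs."""
    grads = [example_grad(w, x, y) for x, y in batch]
    return [sum(g[j] for g in grads) / len(grads) for j in range(len(w))]


def alignment_weights(w, train_batch, clean_batch):
    """Weight each training example by how well its gradient agrees
    with the clean-set gradient (assumed one-step approximation)."""
    g_clean = mean_grad(w, clean_batch)
    # Examples whose gradient opposes the clean gradient are clipped to zero.
    raw = [max(0.0, dot(example_grad(w, x, y), g_clean)) for x, y in train_batch]
    total = sum(raw)
    # Normalize so the weights sum to one; keep all zeros if nothing aligns.
    return [r / total for r in raw] if total > 0 else raw


# Toy data: the true relation is y = 2 * x; the third training label is corrupted.
w = [0.0]
train_batch = [([1.0], 2.0), ([2.0], 4.0), ([1.0], -5.0)]
clean_batch = [([1.0], 2.0), ([3.0], 6.0)]

weights = alignment_weights(w, train_batch, clean_batch)
print(weights)  # the corrupted third example receives weight 0.0
```

In a real network the per-example inner products require differentiating through a look-ahead update of the model, which is the kind of expensive step the abstract says FaMUS approximates layer-wise.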

Learning to Bootstrap for Combating Label Noise
This paper proposes a more generic learnable loss objective which enables a joint reweighting of instances and labels at once, and dynamically adjusts the per-sample importance weight between the real observed labels and pseudo-labels, where the weights are efficiently determined in a meta process.
Delving into Sample Loss Curve to Embrace Noisy and Imbalanced Data
This paper delves into the loss curves and proposes a novel probe-and-allocate training strategy, which achieves state-of-the-art performance on multiple challenging benchmarks and solves the robust deep learning issue of corrupted labels and class imbalance.
Regularizing Generative Adversarial Networks under Limited Data
This work proposes a regularization approach for training robust GAN models on limited data and theoretically shows a connection between the regularized loss and an f-divergence called LeCam-Divergence, which is more robust under limited training data.
Self-supervised and Supervised Joint Training for Resource-rich Machine Translation
A joint training approach to combine self-supervised and supervised learning to optimize NMT models, F2-XEnDec, which achieves substantial improvements over several strong baseline methods and obtains a new state of the art of 46.19 BLEU on English-French when incorporating back translation.
S3: Supervised Self-supervised Learning under Label Noise
This paper addresses the problem of classification in the presence of label noise, and more specifically both closed-set and open-set label noise, that is, when the true label of a sample may or may not belong to the set of the given labels.
Dropout can Simulate Exponential Number of Models for Sample Selection Techniques
Not only is it more convenient to use a single model with Dropout, but this approach also combines the natural benefits of Dropout with that of training an exponential number of models, leading to improved results.
PropMix: Hard Sample Filtering and Proportional MixUp for Learning with Noisy Labels
The learning algorithm PropMix is introduced to handle the issues in large noise rate problems and has state-of-the-art results on CIFAR-10/-100, Red Mini-ImageNet, Clothing1M and WebVision, and in severe label noise benchmarks, its results are substantially better than other methods.
ScanMix: Learning from Severe Label Noise via Semantic Clustering and Semi-Supervised Learning
The proposed training algorithm, ScanMix, combines semantic clustering with semi-supervised learning (SSL) to improve the feature representations and enable an accurate identification of noisy samples, even in severe label noise scenarios.
Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling
A cross-modal pseudo-labeling framework, which generates training pseudo masks by aligning word semantics in captions with visual features of object masks in images to self-train a student model to mitigate the adverse impact of noisy pseudo masks.
Training Robust Object Detectors From Noisy Category Labels and Imprecise Bounding Boxes
A Meta-Refine-Net is proposed to train object detectors from noisy category labels and imprecise bounding boxes; it is model-agnostic and capable of learning from noisy object detection data with only a few clean examples.


Learning to Reweight Examples for Robust Deep Learning
This work proposes a novel meta-learning algorithm that learns to assign weights to training examples based on their gradient directions. It can be easily implemented on any type of deep network, does not require any additional hyperparameter tuning, and achieves impressive performance on class imbalance and corrupted label problems where only a small amount of clean validation data is available.
Adding Gradient Noise Improves Learning for Very Deep Networks
This paper explores the low-overhead and easy-to-implement optimization technique of adding annealed Gaussian noise to the gradient, which is found to be surprisingly effective when training these very deep architectures.
Meta Transition Adaptation for Robust Deep Learning with Noisy Labels
Through the sound guidance of a small set of meta data with clean labels, the noise transition matrix and the classifier parameters can be mutually ameliorated to avoid being trapped by noisy training samples, and without need of any anchor point assumptions.
Training Noise-Robust Deep Neural Networks via Meta-Learning
This work proposes a new loss correction approach, named Meta Loss Correction (MLC), which is model-agnostic and learns T directly from data via the meta-learning framework rather than heuristically approximating it using prior knowledge.
MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels
Experimental results demonstrate that the proposed novel technique of learning another neural network, called MentorNet, to supervise the training of the base deep networks, namely, StudentNet, can significantly improve the generalization performance of deep networks trained on corrupted training data.
Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting
Synthetic and real experiments substantiate the capability of the method for achieving proper weighting functions in class imbalance and noisy label cases, fully complying with the common settings in traditional methods, and more complicated scenarios beyond conventional cases.
Snapshot Distillation: Teacher-Student Optimization in One Generation
Optimizing a deep neural network is a fundamental task in computer vision, yet direct training methods often suffer from over-fitting. Teacher-student optimization aims at providing complementary
How does Disagreement Help Generalization against Label Corruption?
A robust learning paradigm called Co-teaching+, which bridges the "Update by Disagreement" strategy with the original Co-teaching and is much superior to many state-of-the-art methods in the robustness of trained models.
Understanding deep learning requires rethinking generalization
These experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data, and confirm that simple depth two neural networks already have perfect finite sample expressivity.
Co-teaching: Robust training of deep neural networks with extremely noisy labels
Empirical results on noisy versions of MNIST, CIFAR-10 and CIFAR-100 demonstrate that Co-teaching is much superior to the state-of-the-art methods in the robustness of trained deep models.