Corpus ID: 195873898

Unsupervised Data Augmentation for Consistency Training

@article{Xie2020UnsupervisedDA,
  title={Unsupervised Data Augmentation for Consistency Training},
  author={Qizhe Xie and Zihang Dai and Eduard H. Hovy and Minh-Thang Luong and Quoc V. Le},
  journal={arXiv: Learning},
  year={2020}
}
Semi-supervised learning lately has shown much promise in improving deep learning models when labeled data is scarce. Common among recent approaches is the use of consistency training on a large amount of unlabeled data to constrain model predictions to be invariant to input noise. In this work, we present a new perspective on how to effectively noise unlabeled examples and argue that the quality of noising, specifically those produced by advanced data augmentation methods, plays a crucial role… 
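As a rough illustration of the consistency objective the abstract describes (a supervised loss on labeled data plus a term that penalizes disagreement between predictions on an unlabeled example and on its augmented copy), here is a minimal PyTorch-style sketch. It is not the authors' released code; the model, the `augment` callable, and the weighting `lam` are assumptions.

```python
import torch
import torch.nn.functional as F

def uda_style_loss(model, x_labeled, y_labeled, x_unlabeled, augment, lam=1.0):
    """Supervised cross-entropy plus a consistency term on unlabeled data.

    `augment` is a hypothetical callable applying a strong data augmentation
    (e.g. RandAugment for images or back-translation for text).
    """
    # Standard supervised loss on the small labeled batch.
    sup_loss = F.cross_entropy(model(x_labeled), y_labeled)

    # Predictions on the clean unlabeled batch serve as targets;
    # no gradient is propagated through them.
    with torch.no_grad():
        p_clean = F.softmax(model(x_unlabeled), dim=-1)

    # Predictions on the noised (augmented) unlabeled batch.
    log_p_aug = F.log_softmax(model(augment(x_unlabeled)), dim=-1)

    # KL(p_clean || p_aug): penalize disagreement between the two predictions.
    consistency = F.kl_div(log_p_aug, p_clean, reduction="batchmean")

    return sup_loss + lam * consistency
```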

Citations

Efficient Semi-supervised Consistency Training for Natural Language Understanding
TLDR
The results demonstrate that DC models trained with CT methods and dropout-based augmentation on only 0.1% of the labeled data, with the remainder treated as unlabeled, can achieve a top-1 relative accuracy reduction of 12.25%, which paves the way for using large-scale unlabeled data for semi-supervised learning in production NLU systems.
An Exploration of Consistency Learning with Data Augmentation
TLDR
This work studies the addition of a Consistency Loss between representations of original and augmented data points, and is able to dramatically reduce the proposed Distributional Distance metric with the Consistency Loss.
Hybrid Consistency Training with Prototype Adaptation for Few-Shot Learning
TLDR
This work introduces Hybrid Consistency Training to jointly leverage interpolation consistency, which interpolates hidden features and imposes locally linear behavior, and data augmentation consistency, which learns embeddings that are robust to sample variations.
Learning with Neighbor Consistency for Noisy Labels
TLDR
This work presents a method for learning from noisy labels that leverages similarities between training examples in feature space, encouraging the prediction of each example to be similar to its nearest neighbours.
AuxMix: Semi-Supervised Learning with Unconstrained Unlabeled Data
TLDR
AuxMix is proposed, an algorithm that leverages self-supervised learning tasks to learn generic features in order to mask auxiliary data that are not semantically similar to the labeled set and to regularize learning by maximizing the predicted entropy for dissimilar auxiliary samples.
Active Self-Semi-Supervised Learning for Few Labeled Samples Fast Training
TLDR
This paper proposes an active self-semi-supervised training framework that bootstraps semi-supervised models with good prior pseudo-labels, where the priors are obtained by label propagation over self-supervised features.
On The Consistency Training for Open-Set Semi-Supervised Learning
TLDR
This work thoroughly studies how OOD samples affect DNN training in both low- and high-dimensional spaces, where two fundamental SSL methods are considered: Pseudo Labeling (PL) and Data Augmentation based Consistency Training (DACT).
Pseudo-Representation Labeling Semi-Supervised Learning
TLDR
Pseudo-representation labeling is a simple and flexible framework that uses pseudo-labeling techniques to iteratively label a small amount of unlabeled data and add it to the training set; it outperforms current state-of-the-art semi-supervised learning methods on industrial classification problems such as the WM-811K wafer map and the MIT-BIH Arrhythmia dataset.
Augmentation-induced Consistency Regularization for Classification
TLDR
This paper proposes a consistency regularization framework based on data augmentation, called CR-Aug, which forces the output distributions of different sub-models generated by data augmentation to be consistent with each other; it outperforms baseline methods by a large margin.
Augmentation Strategies for Learning with Noisy Labels
TLDR
This paper proposes and examines multiple augmentation strategies for algorithms tackling the "learning with noisy labels" problem and improves accuracy on the CIFAR-10 benchmark at 90% symmetric noise by more than 15% in absolute accuracy, and improves performance on the Clothing1M dataset.
...
...

References

SHOWING 1-10 OF 83 REFERENCES
Realistic Evaluation of Deep Semi-Supervised Learning Algorithms
TLDR
This work creates a unified reimplementation and evaluation platform for various widely-used SSL techniques and finds that the performance of simple baselines which do not use unlabeled data is often underreported, that SSL methods differ in sensitivity to the amount of labeled and unlabeled data, and that performance can degrade substantially when the unlabeled dataset contains out-of-class examples.
Temporal Ensembling for Semi-Supervised Learning
TLDR
Self-ensembling is introduced, where it is shown that this ensemble prediction can be expected to be a better predictor for the unknown labels than the output of the network at the most recent training epoch, and can thus be used as a target for training.
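To make the mechanism above concrete: Temporal Ensembling keeps an exponential moving average of each training example's past predictions and uses it as the unsupervised target. The sketch below is an illustration only; the dataset sizes, the 0.6 momentum, and the MSE consistency term are assumed, not taken from the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

num_examples, num_classes = 50000, 10   # illustrative sizes, not from the paper
alpha = 0.6                              # EMA momentum (assumed value)

Z = torch.zeros(num_examples, num_classes)      # accumulated predictions
z_hat = torch.zeros(num_examples, num_classes)  # bias-corrected targets

def temporal_ensembling_step(model, x, idx, epoch):
    """One unsupervised step: current predictions are pulled toward the
    ensemble of predictions accumulated over earlier epochs."""
    p = F.softmax(model(x), dim=-1)

    # Consistency between current predictions and the ensembled targets.
    unsup_loss = F.mse_loss(p, z_hat[idx])

    # Update the ensemble with the new predictions (no gradient flows here).
    with torch.no_grad():
        Z[idx] = alpha * Z[idx] + (1.0 - alpha) * p.detach()
        z_hat[idx] = Z[idx] / (1.0 - alpha ** (epoch + 1))  # startup bias correction

    return unsup_loss
```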
RandAugment: Practical data augmentation with no separate search
TLDR
RandAugment can be used uniformly across different tasks and datasets and works out of the box, matching or surpassing all previous learned augmentation approaches on CIFAR-10, CIFAR-100, SVHN, and ImageNet.
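RandAugment replaces the learned augmentation search with two global knobs: the number of transformations N and a shared magnitude M. A simplified sketch using PIL is below; the operation pool is deliberately reduced and the magnitude scaling is an assumption (the actual method uses a larger, carefully calibrated set of operations).

```python
import random
from PIL import Image, ImageEnhance, ImageOps

def _ops(magnitude):
    """A reduced, illustrative pool of operations at a shared magnitude."""
    m = magnitude / 30.0  # map integer magnitude to roughly [0, 1]
    return [
        lambda img: ImageOps.autocontrast(img),
        lambda img: ImageOps.equalize(img),
        lambda img: img.rotate(30 * m),
        lambda img: ImageEnhance.Color(img).enhance(1.0 + m),
        lambda img: ImageEnhance.Contrast(img).enhance(1.0 + m),
        lambda img: ImageEnhance.Sharpness(img).enhance(1.0 + m),
        lambda img: ImageOps.posterize(img, max(1, int(8 - 4 * m))),
        lambda img: ImageOps.solarize(img, int(256 - 128 * m)),
    ]

def rand_augment(img: Image.Image, n: int = 2, magnitude: int = 10) -> Image.Image:
    """Apply n randomly chosen operations, all at the same global magnitude."""
    for op in random.choices(_ops(magnitude), k=n):
        img = op(img)
    return img
```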
Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning
TLDR
An unsupervised loss function is proposed that takes advantage of the stochastic nature of these methods and minimizes the difference between the predictions of multiple passes of a training sample through the network.
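The loss described above can be pictured as running the same unlabeled batch through the network several times under different random transformations (with dropout active) and penalizing disagreement between the passes. A minimal sketch under assumed names (`transform` is a hypothetical random augmentation):

```python
import torch
import torch.nn.functional as F

def transformation_stability_loss(model, x, transform, passes=2):
    """Mean squared disagreement between predictions from multiple stochastic
    passes over the same unlabeled batch."""
    preds = [F.softmax(model(transform(x)), dim=-1) for _ in range(passes)]

    loss = x.new_zeros(())
    count = 0
    for i in range(passes):
        for j in range(i + 1, passes):
            loss = loss + F.mse_loss(preds[i], preds[j])
            count += 1
    return loss / max(count, 1)
```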
Unifying semi-supervised and robust learning by mixup
TLDR
It is suggested that semi-supervised learning can outperform robust learning with noisy labels and a training strategy for mixing mixup techniques to learn from bi-quality data effectively is proposed.
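For reference, a standard mixup training step trains on convex combinations of input pairs and their labels; the work above builds its bi-quality training strategy on this primitive. The function names and the Beta parameter below are illustrative, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def mixup_step(model, x, y, alpha=0.4):
    """Cross-entropy on a random convex combination of examples and labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))

    # Mix the inputs; the labels are mixed implicitly through the loss below.
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    logits = model(x_mixed)

    return lam * F.cross_entropy(logits, y) + (1.0 - lam) * F.cross_entropy(logits, y[perm])
```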
There Are Many Consistent Explanations of Unlabeled Data: Why You Should Average
TLDR
It is shown that SGD struggles to converge on the consistency loss and continues to make large steps that lead to changes in predictions on the test data, and proposes to train consistency-based methods with Stochastic Weight Averaging (SWA), a recent approach which averages weights along the trajectory of SGD with a modified learning rate schedule.
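The SWA step referenced above amounts to keeping a running average of the weights visited by SGD late in training and evaluating with the averaged weights. A hand-rolled sketch of that update follows (PyTorch also ships utilities for this; the loop names in the usage comment are assumptions).

```python
import copy
import torch

def update_swa(swa_model, model, n_averaged):
    """Fold the current weights into the running average:
    w_swa <- (n * w_swa + w) / (n + 1)."""
    with torch.no_grad():
        for p_swa, p in zip(swa_model.parameters(), model.parameters()):
            p_swa.mul_(n_averaged / (n_averaged + 1.0)).add_(p / (n_averaged + 1.0))
    return n_averaged + 1

# Usage sketch (assumed training loop and schedule):
# swa_model, n = copy.deepcopy(model), 0
# for epoch in range(swa_start, num_epochs):
#     train_one_epoch(model, ...)
#     n = update_swa(swa_model, model, n)
```

In practice the batch-normalization statistics of the averaged model are recomputed with a forward pass over the training data before evaluation.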
Smooth Neighbors on Teacher Graphs for Semi-Supervised Learning
TLDR
A novel method, called Smooth Neighbors on Teacher Graphs (SNTG), which serves as a similarity measure with respect to which the representations of "similar" neighboring points are learned to be smooth on the low-dimensional manifold and achieves state-of-the-art results on semi-supervised learning benchmarks.
Semi-Supervised Sequence Modeling with Cross-View Training
TLDR
Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data, is proposed and evaluated, achieving state-of-the-art results.
Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results
TLDR
The recently proposed Temporal Ensembling has achieved state-of-the-art results in several semi-supervised learning benchmarks, but it becomes unwieldy when learning large datasets, so Mean Teacher, a method that averages model weights instead of label predictions, is proposed.
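Concretely, Mean Teacher replaces the per-example prediction average with an exponential moving average of the student's weights; the teacher's detached predictions then serve as consistency targets. A minimal sketch of the weight update (the decay value is illustrative):

```python
import copy
import torch

def update_teacher(teacher, student, decay=0.99):
    """teacher <- decay * teacher + (1 - decay) * student, parameter-wise."""
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(decay).add_(s, alpha=1.0 - decay)

# Usage sketch: the teacher starts as a copy of the student
# (teacher = copy.deepcopy(student)) and is updated after every optimizer step.
```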
Are Labels Required for Improving Adversarial Robustness?
TLDR
Theoretically, it is shown that in a simple statistical setting, the sample complexity for learning an adversarially robust model from unlabeled data matches the fully supervised case up to constant factors, and this finding extends as well to the more realistic case where the unlabeled data is also uncurated, therefore opening a new avenue for improving adversarial training.
...
...