Corpus ID: 222209080

Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data

@article{Wei2021TheoreticalAO,
  title={Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data},
  author={Colin Wei and Kendrick Shen and Yining Chen and Tengyu Ma},
  journal={ArXiv},
  year={2021},
  volume={abs/2010.03622}
}
Self-training algorithms, which train a model to fit pseudolabels predicted by another previously-learned model, have been very successful for learning with unlabeled data using neural networks. However, the current theoretical understanding of self-training only applies to linear models. This work provides a unified theoretical analysis of self-training with deep networks for semi-supervised learning, unsupervised domain adaptation, and unsupervised learning. At the core of our analysis is a… 
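The recipe the abstract refers to, fitting pseudolabels produced by a previously-learned model, can be sketched as the minimal PyTorch-style loop below. The `teacher`/`student` models, optimizer, and the 0.95 confidence threshold are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def self_training_step(teacher, student, optimizer, unlabeled_x, threshold=0.95):
    """One pseudo-labeling step: confident teacher predictions become
    targets for the student. Illustrative sketch only; the threshold and
    the confidence filter are assumptions, not details from the paper."""
    with torch.no_grad():
        probs = F.softmax(teacher(unlabeled_x), dim=1)
        conf, pseudo_y = probs.max(dim=1)
        mask = (conf >= threshold).float()      # keep only confident pseudolabels

    logits = student(unlabeled_x)
    per_example = F.cross_entropy(logits, pseudo_y, reduction="none")
    loss = (per_example * mask).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice the student is often reused as the next round's teacher, which gives the iterative form of self-training.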


Cycle Self-Training for Domain Adaptation
TLDR
This paper proposes Cycle Self-Training (CST), a principled self-training algorithm that explicitly enforces pseudo-labels to generalize across domains, analyzes CST theoretically under realistic assumptions, and provides hard cases where CST recovers the target ground truth while both invariant feature learning and vanilla self-training fail.
Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss
TLDR
This work proposes a loss that performs spectral decomposition on the population augmentation graph and can be succinctly written as a contrastive learning objective on neural net representations, leading to features with provable accuracy guarantees under linear probe evaluation.
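Read literally, the loss in this TLDR has a simple batch form, L(f) = -2·E[f(x)ᵀf(x⁺)] + E[(f(x)ᵀf(x'))²], with positives being two augmentations of the same input and negatives other examples in the batch. The sketch below is one plausible reconstruction of that objective, not the authors' code.

```python
import torch

def spectral_contrastive_loss(z1, z2):
    """Batch estimate of L(f) = -2 E[f(x)^T f(x+)] + E[(f(x)^T f(x'))^2].
    z1 and z2 are features of two augmentations of the same batch;
    positives are matching rows, negatives are all other cross pairs.
    Hedged reconstruction of the published objective, not the authors' code."""
    n = z1.shape[0]
    pos = -2.0 * (z1 * z2).sum(dim=1).mean()               # positive-pair term
    cross = z1 @ z2.t()                                     # all pairwise inner products
    off_diag = cross - torch.diag(torch.diagonal(cross))    # drop the positive pairs
    neg = (off_diag ** 2).sum() / (n * (n - 1))             # negative-pair term
    return pos + neg
```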
Streaming Self-Training via Domain-Agnostic Unlabeled Images
We present streaming self-training (SST) that aims to democratize the process of learning visual recognition models such that a non-expert user can define a new task depending on their needs via a …
Self-training Converts Weak Learners to Strong Learners in Mixture Models
TLDR
The results imply that mixture models can be learned to within ε of the Bayes-optimal accuracy using at most O(d) labeled examples and Õ(d/ε²) unlabeled examples by way of a semi-supervised self-training algorithm.
Self-Tuning for Data-Efficient Deep Learning
TLDR
Self-Tuning is presented to enable data-efficient deep learning by unifying the exploration of labeled and unlabeled data with the transfer of a pre-trained model, along with a Pseudo Group Contrast (PGC) mechanism to mitigate the reliance on pseudo-labels and boost the tolerance to false labels.
Toward Understanding the Feature Learning Process of Self-supervised Contrastive Learning
TLDR
It is proved that contrastive learning with ReLU networks learns the desired sparse features when proper augmentations are adopted, and an underlying principle called feature decoupling is presented to explain the effect of augmentations.
Self-Supervised Learning of Graph Neural Networks: A Unified Review
TLDR
A unified review is provided that organizes SSL methods for training GNNs into contrastive and predictive models, shedding light on the similarities and differences of various methods and setting the stage for developing new methods and algorithms.
Generate, Annotate, and Learn: Generative Models Advance Self-Training and Knowledge Distillation
TLDR
A general framework called "generate, annotate, and learn" (GAL) is presented that uses unconditional generative models to synthesize in-domain unlabeled data, helping advance semi-supervised learning (SSL) and knowledge distillation (KD) on different tasks.
A Theory of Label Propagation for Subpopulation Shift
TLDR
This work proposes a provably effective framework for domain adaptation based on label propagation under a simple but realistic expansion assumption, and adapts consistency-based semi-supervised learning methods to domain adaptation settings with significant improvements.
Deep Co-Training with Task Decomposition for Semi-Supervised Domain Adaptation
TLDR
It is argued that the labeled target data needs to be handled separately for effective SSDA, and it is proposed to explicitly decompose the SSDA task into two sub-tasks: a semi-supervised learning (SSL) task in the target domain and an unsupervised domain adaptation (UDA) task across domains.
...

References

Showing 1–10 of 86 references
Statistical and Algorithmic Insights for Semi-supervised Learning with Self-training
TLDR
This work establishes a connection between self-training-based semi-supervision and the more general problem of learning with heterogeneous data and weak supervision, and shows how a purely unsupervised notion of generalization based on self-training-based clustering can be formalized in terms of cluster margin.
Label Propagation for Deep Semi-Supervised Learning
TLDR
This work employs a transductive label propagation method based on the manifold assumption to make predictions on the entire dataset, uses these predictions to generate pseudo-labels for the unlabeled data, and trains a deep neural network on them.
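The transductive label propagation step mentioned above is typically the classic diffusion F ← αSF + (1−α)Y on a nearest-neighbor affinity graph of deep features. The NumPy sketch below shows that generic scheme, not the cited implementation.

```python
import numpy as np

def label_propagation(W, Y, alpha=0.99, iters=50):
    """Diffuse labels over an affinity graph: F <- alpha * S @ F + (1 - alpha) * Y,
    where S = D^{-1/2} W D^{-1/2}. Rows of Y are one-hot for labeled points and
    zero for unlabeled ones. Generic sketch of the classic scheme, not the
    cited implementation."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    F = Y.astype(float).copy()
    for _ in range(iters):
        F = alpha * S @ F + (1 - alpha) * Y
    return F.argmax(axis=1)   # pseudo-labels for every point
```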
Co-Training and Expansion: Towards Bridging Theory and Practice
TLDR
A much weaker "expansion" assumption on the underlying data distribution is proposed, which is proved to be sufficient for iterative co-training to succeed given appropriately strong PAC-learning algorithms on each feature set, and which is to some extent necessary as well.
Unsupervised Data Augmentation for Consistency Training
TLDR
A new perspective on how to effectively noise unlabeled examples is presented, and it is argued that the quality of the noise, specifically that produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning.
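The consistency-training idea behind this line of work can be sketched as a KL term between the prediction on a clean unlabeled example and the prediction on a strongly augmented copy. The snippet below is an illustrative sketch; `augment` and the sharpening temperature are assumed placeholders, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x_unlabeled, augment, temperature=0.4):
    """Consistency term on unlabeled data: the (sharpened, detached)
    prediction on the clean example is the target for the prediction on a
    strongly augmented copy. `augment` and `temperature` are placeholders;
    this is an illustrative sketch, not the paper's exact recipe."""
    with torch.no_grad():
        target = F.softmax(model(x_unlabeled) / temperature, dim=1)
    log_pred = F.log_softmax(model(augment(x_unlabeled)), dim=1)
    return F.kl_div(log_pred, target, reduction="batchmean")
```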
Confidence Regularized Self-Training
TLDR
A confidence regularized self-training (CRST) framework is proposed, formulated as regularized self-training, which treats pseudo-labels as continuous latent variables jointly optimized via alternating optimization; two types of confidence regularization are proposed: label regularization (LR) and model regularization (MR).
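One generic form of such a confidence regularizer adds an output-entropy penalty to the pseudo-label cross-entropy; the sketch below shows that form and should not be read as the paper's exact objective or weighting.

```python
import torch
import torch.nn.functional as F

def confidence_regularized_loss(logits, pseudo_y, reg_weight=0.1):
    """Pseudo-label cross-entropy plus an output-entropy penalty that
    discourages overconfident predictions. One generic instance of a
    confidence regularizer; not the paper's exact objective or weighting."""
    ce = F.cross_entropy(logits, pseudo_y)
    probs = F.softmax(logits, dim=1)
    neg_entropy = (probs * torch.log(probs.clamp_min(1e-12))).sum(dim=1).mean()
    return ce + reg_weight * neg_entropy   # minimizing neg-entropy spreads probability mass
```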
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
TLDR
This work introduces Bootstrap Your Own Latent (BYOL), a new approach to self-supervised image representation learning that performs on par or better than the current state of the art on both transfer and semi-supervised benchmarks.
Unlabeled Data Improves Adversarial Robustness
TLDR
It is proved that unlabeled data bridges the complexity gap between standard and robust classification: a simple semi-supervised learning procedure (self-training) achieves high robust accuracy using the same number of labels required for achieving high standard accuracy.
Temporal Ensembling for Semi-Supervised Learning
TLDR
Self-ensembling is introduced, which aggregates the network's predictions over previous training epochs into an ensemble prediction; this ensemble prediction can be expected to be a better predictor of the unknown labels than the output of the network at the most recent training epoch, and can thus be used as a target for training.
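A minimal sketch of that ensembling mechanism, an exponential moving average of each sample's predictions across epochs with bias correction, is given below; the shapes and the decay rate are assumptions for illustration, not the authors' code.

```python
import numpy as np

class TemporalEnsemble:
    """Exponential moving average of each sample's predictions across epochs,
    with bias correction, in the spirit of the self-ensembling described above.
    Shapes (n samples, k classes) and the decay rate are assumptions."""
    def __init__(self, n, k, alpha=0.6):
        self.alpha = alpha
        self.Z = np.zeros((n, k))   # accumulated ensemble predictions
        self.t = 0                  # epoch counter for bias correction

    def update(self, epoch_preds):
        """Fold in this epoch's predictions and return bias-corrected targets."""
        self.t += 1
        self.Z = self.alpha * self.Z + (1 - self.alpha) * epoch_preds
        return self.Z / (1 - self.alpha ** self.t)
```

The unsupervised part of the training loss then typically penalizes the squared difference between the current prediction and this target.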
Asymmetric Tri-training for Unsupervised Domain Adaptation
TLDR
This work proposes the use of an asymmetric tri-training method for unsupervised domain adaptation, where two networks are used to label unlabeled target samples and one network is trained on the pseudo-labeled samples to obtain target-discriminative representations.
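The labeling step described there, two networks jointly pseudo-labeling target samples, can be read as an agreement-plus-confidence filter; the rule below is an illustrative reading, with the threshold and the exact agreement criterion as assumptions.

```python
import torch
import torch.nn.functional as F

def tri_training_pseudolabels(f1_logits, f2_logits, threshold=0.9):
    """Keep a target sample only when the two labeling networks agree and at
    least one of them is confident. Illustrative reading of the labeling step;
    the threshold and the exact filtering rule are assumptions."""
    p1, p2 = F.softmax(f1_logits, dim=1), F.softmax(f2_logits, dim=1)
    c1, y1 = p1.max(dim=1)
    c2, y2 = p2.max(dim=1)
    mask = (y1 == y2) & (torch.max(c1, c2) >= threshold)
    return y1[mask], mask   # pseudo-labels and the mask of kept samples
```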
Semi-supervised Learning by Entropy Minimization
TLDR
This framework, which motivates minimum entropy regularization, makes it possible to incorporate unlabeled data into standard supervised learning, and includes other approaches to the semi-supervised problem as particular or limiting cases.
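The minimum-entropy regularizer amounts to adding the prediction entropy on unlabeled data to the supervised loss; a short sketch of that generic objective (with an assumed weighting hyperparameter) is:

```python
import torch
import torch.nn.functional as F

def entropy_minimization_loss(logits_labeled, y, logits_unlabeled, weight=0.1):
    """Supervised cross-entropy plus the entropy of predictions on unlabeled
    data. Sketch of the general objective; the weighting is an assumed
    hyperparameter, not a value from the paper."""
    sup = F.cross_entropy(logits_labeled, y)
    p = F.softmax(logits_unlabeled, dim=1)
    ent = -(p * torch.log(p.clamp_min(1e-12))).sum(dim=1).mean()
    return sup + weight * ent
```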
...