Corpus ID: 235376822

AdaMatch: A Unified Approach to Semi-Supervised Learning and Domain Adaptation

@article{Berthelot2021AdaMatchAU,
  title={AdaMatch: A Unified Approach to Semi-Supervised Learning and Domain Adaptation},
  author={David Berthelot and Rebecca Roelofs and Kihyuk Sohn and Nicholas Carlini and Alexey Kurakin},
  journal={ArXiv},
  year={2021},
  volume={abs/2106.04732}
}
We extend semi-supervised learning to the problem of domain adaptation to learn significantly higher-accuracy models that train on one data distribution and test on a different one. With the goal of generality, we introduce AdaMatch, a method that unifies the tasks of unsupervised domain adaptation (UDA), semi-supervised learning (SSL), and semi-supervised domain adaptation (SSDA). In an extensive experimental study, we compare its behavior with respective state-of-the-art techniques from SSL… 
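
The abstract is truncated above, but the family of methods it describes shares a common pseudo-labeling/consistency template: the model's own confident predictions on unlabeled (target-domain) examples serve as training targets alongside the supervised source loss. Below is a minimal illustrative sketch of that generic template only, not the AdaMatch procedure itself; the function names, the 0.9 confidence threshold, and the weak/strong augmentation pairing are assumptions.

import numpy as np

def softmax(logits):
    """Row-wise softmax over class logits."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def pseudo_label_loss(logits_weak, logits_strong, threshold=0.9):
    """Cross-entropy of strongly-augmented predictions against pseudo-labels
    taken from confident weakly-augmented predictions; low-confidence examples
    are masked out. In practice this is added to the supervised source loss."""
    probs_weak = softmax(logits_weak)
    pseudo = probs_weak.argmax(axis=1)              # hard pseudo-labels
    mask = probs_weak.max(axis=1) >= threshold      # keep only confident examples
    probs_strong = softmax(logits_strong)
    nll = -np.log(probs_strong[np.arange(len(pseudo)), pseudo] + 1e-12)
    return float((nll * mask).mean())

# Toy usage with random logits standing in for a model's outputs.
rng = np.random.default_rng(0)
weak = rng.normal(size=(8, 10))                     # 8 unlabeled examples, 10 classes
strong = weak + rng.normal(scale=0.5, size=(8, 10)) # perturbed copy of the same batch
print(pseudo_label_loss(weak, strong))
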
Citations

Pick up the PACE: Fast and Simple Domain Adaptation via Ensemble Pseudo-Labeling
TLDR: A fast and simple DA method consisting of three stages (domain alignment by covariance matching, pseudo-labeling, and ensembling) that exceeds previous state-of-the-art methods on most benchmark adaptation tasks without training a neural network.
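
The exact PACE recipe is not reproduced here, but a covariance-matching alignment stage is commonly implemented CORAL-style: whiten the source features, then re-color them with the target covariance so the two domains share second-order statistics. A minimal numpy sketch under that assumption (all names and the eps regularizer are illustrative):

import numpy as np

def _sym_sqrt(mat, inverse=False):
    """(Inverse) square root of a symmetric positive semi-definite matrix."""
    w, v = np.linalg.eigh(mat)
    w = np.clip(w, 1e-12, None)
    d = w ** (-0.5 if inverse else 0.5)
    return (v * d) @ v.T

def match_covariance(source_feats, target_feats, eps=1e-3):
    """Align source features to the target domain by matching covariances."""
    def cov(x):
        xc = x - x.mean(axis=0)
        return xc.T @ xc / (len(x) - 1) + eps * np.eye(x.shape[1])

    mu_s = source_feats.mean(axis=0)
    aligned = (source_feats - mu_s) @ _sym_sqrt(cov(source_feats), inverse=True)  # whiten
    aligned = aligned @ _sym_sqrt(cov(target_feats))                              # re-color
    return aligned + mu_s  # means are left untouched; matching them too is optional
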
How Does Contrastive Pre-training Connect Disparate Domains?
TLDR: A new measure of connectivity (the relative connection strengths between same and different classes across domains) is proposed that governs the success of contrastive pre-training for domain adaptation in a simple example and strongly correlates with the results on benchmark datasets.
Not All Labels Are Equal: Rationalizing The Labeling Costs for Training Object Detection
TLDR: This work proposes a unified framework for active learning that considers both the uncertainty and the robustness of the detector, ensuring that the network performs well in all classes, and leverages auto-labeling to suppress potential distribution drift while boosting the performance of the model.
Holistic Semi-Supervised Approaches for EEG Representation Learning
TLDR: Three state-of-the-art holistic semi-supervised approaches, namely MixMatch, FixMatch, and AdaMatch, as well as five classical semi-supervised methods, are adapted for EEG learning and show strong results even when only one labeled sample is used per class.
Extending the WILDS Benchmark for Unsupervised Adaptation
TLDR: The WILDS 2.0 update is presented, which extends 8 of the 10 datasets in the WILDS benchmark of distribution shifts to include curated unlabeled data that would be realistically obtainable in deployment, and systematically benchmarks state-of-the-art methods that leverage unlabeled data, including domain-invariant, self-training, and self-supervised methods.
FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning
TLDR: This work proposes FreeMatch, which sets the confidence threshold in a self-adaptive manner, and introduces a self-adaptive class fairness regularization penalty that encourages the model to produce diverse predictions during the early stages of training; the results indicate the superiority of FreeMatch, especially when labeled data are extremely rare.
Self-Supervised Contrastive Pre-Training For Time Series via Time-Frequency Consistency
TLDR: Motivated by time-frequency consistency (TF-C), a decomposable pre-training model is proposed in which the self-supervised signal is provided by the distance between time and frequency components, each individually trained by contrastive estimation.
Sample Efficiency of Data Augmentation Consistency Regularization
TLDR: A simple and novel analysis for linear regression with label-invariant augmentations is presented, demonstrating that data augmentation consistency (DAC) is intrinsically more efficient than empirical risk minimization on augmented data (DA-ERM).
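
The contrast analyzed there is between pooling augmented copies into the training set (DA-ERM) and keeping the original empirical risk while penalizing prediction differences between each example and its augmentation (DAC). For linear regression with one augmented copy per example, the two objectives look roughly as sketched below; the weight lam and all names are illustrative, not the paper's notation.

import numpy as np

def da_erm_loss(w, X, y, X_aug):
    """DA-ERM: squared error on original and augmented copies pooled together,
    reusing each example's label for its augmented copy."""
    X_all = np.vstack([X, X_aug])
    y_all = np.concatenate([y, y])
    return np.mean((X_all @ w - y_all) ** 2)

def dac_loss(w, X, y, X_aug, lam=1.0):
    """DAC: empirical risk on the original data plus a consistency penalty that
    pushes predictions on augmented copies toward those on the originals."""
    erm = np.mean((X @ w - y) ** 2)
    consistency = np.mean((X_aug @ w - X @ w) ** 2)
    return erm + lam * consistency
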
Ensembles and Cocktails: Robust Finetuning for Natural Language Generation
TLDR: This work presents methods to combine the benefits of full and lightweight finetuning, achieving strong performance both in-distribution (ID) and out-of-distribution (OOD), and provides some explanatory theory in a multiclass logistic regression setting with a large number of classes.
ProxyMix: Proxy-based Mixup Training with Label Refinery for Source-Free Domain Adaptation
TLDR: This work proposes an effective method named Proxy-based Mixup training with label refinery (ProxyMix), which defines the weights of the classifier as class prototypes and constructs a class-balanced proxy source domain from the nearest neighbors of the prototypes to bridge the unseen source domain and the target domain.
...

References

Showing 1-10 of 41 references
Temporal Ensembling for Semi-Supervised Learning
TLDR: Self-ensembling is introduced, in which predictions from different training epochs are aggregated; this ensemble prediction can be expected to be a better predictor for the unknown labels than the output of the network at the most recent training epoch, and can thus be used as a target for training.
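
Concretely, temporal ensembling keeps a per-example running average of the network's predictions across epochs and uses the bias-corrected average as the unsupervised target, typically with a mean-squared consistency loss against the current output. A small sketch of the accumulator, with the alpha value and names chosen for illustration:

import numpy as np

class TemporalEnsemble:
    """Exponential moving average of each example's predictions across epochs."""
    def __init__(self, num_examples, num_classes, alpha=0.6):
        self.alpha = alpha
        self.Z = np.zeros((num_examples, num_classes))  # accumulated predictions
        self.epoch = 0

    def update(self, predictions):
        """Call once per epoch with the current softmax outputs for all examples;
        returns the bias-corrected targets for the consistency loss."""
        self.epoch += 1
        self.Z = self.alpha * self.Z + (1.0 - self.alpha) * predictions
        return self.Z / (1.0 - self.alpha ** self.epoch)  # startup bias correction
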
Semi-supervised Domain Adaptation with Subspace Learning for visual recognition
TLDR: A novel domain adaptation framework, named Semi-supervised Domain Adaptation with Subspace Learning (SDASL), which jointly explores invariant low-dimensional structures across domains to correct data distribution mismatch and leverages available unlabeled target examples to exploit the underlying intrinsic information in the target domain.
Unsupervised Data Augmentation for Consistency Training
TLDR: A new perspective on how to effectively noise unlabeled examples is presented, and it is argued that the quality of noising, specifically noise produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning.
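
The underlying consistency-training objective is simple: hold the prediction on the clean (or lightly noised) unlabeled example fixed and pull the prediction on its heavily augmented version toward it, usually with a KL divergence. A minimal sketch; the names, the eps constant, and the stop-gradient convention (the clean prediction is treated as a fixed target) are assumptions.

import numpy as np

def softmax(logits):
    """Row-wise softmax over class logits."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def consistency_loss(logits_clean, logits_augmented, eps=1e-12):
    """Mean KL(clean || augmented) over a batch of unlabeled examples;
    the clean prediction serves as the fixed target distribution."""
    p = softmax(logits_clean)
    q = softmax(logits_augmented)
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1)))
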
Fast Generalized Distillation for Semi-Supervised Domain Adaptation
TLDR: It is shown that, without accessing the source data, GDSDA can effectively utilize unlabeled data to transfer knowledge from the source models and efficiently solve the semi-supervised domain adaptation (SDA) problem.
Frustratingly Easy Semi-Supervised Domain Adaptation
TLDR: This work builds on the notion of an augmented feature space and harnesses unlabeled data in the target domain to improve the transfer of information from source to target; it can be applied as a pre-processing step to any supervised learner.
Semi-supervised Domain Adaptation with Instance Constraints
TLDR: It is shown that imposing smoothness constraints on the classifier scores over unlabeled data can lead to improved adaptation results; this work proposes techniques that build on existing domain adaptation methods by explicitly modeling these instance-level relationships and demonstrates empirically that they improve recognition accuracy.
Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks
TLDR: This generative adversarial network (GAN)-based method adapts source-domain images to appear as if drawn from the target domain, and outperforms the state-of-the-art on a number of unsupervised domain adaptation scenarios by large margins.
Self-ensembling for visual domain adaptation
TLDR: On small image benchmarks, the use of self-ensembling for visual domain adaptation not only outperforms prior art but can also achieve accuracy close to that of a classifier trained in a supervised fashion.
Gotta Adapt 'Em All: Joint Pixel and Feature-Level Domain Adaptation for Recognition in the Wild
TLDR: This work uses 3D geometry and image synthesis based on a generalized appearance flow to preserve identity across pose transformations, while using an attribute-conditioned CycleGAN to translate a single source into multiple target images that differ in lower-level properties such as lighting.
Generalizing to unseen domains via distribution matching
TLDR: This work focuses on domain generalization, a formalization where the data-generating process at test time may yield samples from never-before-seen domains (distributions), and relies on a simple lemma to derive a generalization bound for this setting.
...