Weighted Ensemble Self-Supervised Learning
@article{Ruan2022WeightedES,
  title   = {Weighted Ensemble Self-Supervised Learning},
  author  = {Yangjun Ruan and Saurabh Singh and Warren R. Morningstar and Alexander A. Alemi and Sergey Ioffe and Ian S. Fischer and Joshua V. Dillon},
  journal = {ArXiv},
  year    = {2022},
  volume  = {abs/2211.09981}
}
Ensembling has proven to be a powerful technique for boosting model performance, uncertainty estimation, and robustness in supervised learning. Advances in self-supervised learning (SSL) enable leveraging large unlabeled corpora for state-of-the-art few-shot and supervised learning performance. In this paper, we explore how ensemble methods can improve recent SSL techniques by developing a framework that permits data-dependent weighted cross-entropy losses. We refrain from ensembling the…
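The abstract's central technical idea is a loss that combines per-member cross-entropy terms with data-dependent weights. The sketch below illustrates one plausible form of such a loss, assuming an ensemble of lightweight linear heads over a shared backbone, soft targets from a teacher, and a hypothetical learned per-example `weight_scorer`; these names and architectural choices are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal PyTorch sketch of a data-dependent weighted cross-entropy loss over an
# ensemble of projection heads. Illustrative only: the head and weighting modules
# are assumptions, not the formulation used in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class WeightedEnsembleCE(nn.Module):
    def __init__(self, feat_dim: int, num_classes: int, num_heads: int):
        super().__init__()
        # Ensemble of lightweight heads on top of a shared (un-ensembled) backbone.
        self.heads = nn.ModuleList(
            [nn.Linear(feat_dim, num_classes) for _ in range(num_heads)]
        )
        # Hypothetical scorer producing one weight logit per head for each example.
        self.weight_scorer = nn.Linear(feat_dim, num_heads)

    def forward(self, feats: torch.Tensor, target_probs: torch.Tensor) -> torch.Tensor:
        # feats: (B, feat_dim) backbone features.
        # target_probs: (B, num_classes) soft targets, e.g. from a teacher network.
        per_head_ce = torch.stack(
            [-(target_probs * F.log_softmax(h(feats), dim=-1)).sum(-1) for h in self.heads],
            dim=-1,
        )  # (B, num_heads): cross-entropy of each head against the soft targets.
        # Data-dependent weights: a simplex-valued weight vector per example.
        weights = F.softmax(self.weight_scorer(feats), dim=-1)  # (B, num_heads)
        # Weighted combination of the per-head losses, averaged over the batch.
        return (weights * per_head_ce).sum(-1).mean()
```

In this sketch only the small heads and the weighting module are replicated, so the per-example weights let the ensemble emphasize whichever head matches the target distribution best for a given input while the expensive backbone is shared.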