# Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance

```bibtex
@article{Jean2018SemisupervisedDK,
  title   = {Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance},
  author  = {Neal Jean and Sang Michael Xie and Stefano Ermon},
  journal = {ArXiv},
  year    = {2018},
  volume  = {abs/1805.10407}
}
```

Large amounts of labeled data are typically required to train deep learning models. For many real-world problems, however, acquiring additional data can be expensive or even impossible. We present semi-supervised deep kernel learning (SSDKL), a semi-supervised regression model based on minimizing predictive variance in the posterior regularization framework. SSDKL combines the hierarchical representation learning of neural networks with the probabilistic modeling capabilities of Gaussian processes.
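The objective the abstract describes can be sketched in a few lines: a Gaussian process is placed on top of learned features, the labeled data contribute the usual GP negative log marginal likelihood, and the unlabeled data contribute their mean posterior predictive variance as a regularizer. The sketch below uses NumPy with a single tanh layer standing in for the deep network; the function names, the `alpha` weight, and all hyperparameters are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential kernel between rows of A and rows of B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def feature_map(X, W):
    # Stand-in for the deep network: one tanh layer (assumption)
    return np.tanh(X @ W)

def ssdkl_objective(W, X_lab, y, X_unl, noise=0.1, alpha=0.5):
    Z, Zu = feature_map(X_lab, W), feature_map(X_unl, W)
    K = rbf_kernel(Z, Z) + noise * np.eye(len(Z))
    K_inv = np.linalg.inv(K)
    # Supervised term: GP negative log marginal likelihood on labeled data
    _, logdet = np.linalg.slogdet(K)
    nll = 0.5 * (y @ K_inv @ y + logdet + len(y) * np.log(2 * np.pi))
    # Unsupervised term: mean posterior predictive variance on unlabeled inputs,
    # diag(k(x*,x*) - k*^T (K + noise I)^{-1} k*), with k(x*,x*) = 1 for RBF
    K_su = rbf_kernel(Z, Zu)
    var = 1.0 - np.sum(K_su * (K_inv @ K_su), axis=0)
    return nll + alpha * var.mean()

rng = np.random.default_rng(0)
X_lab, y = rng.normal(size=(8, 3)), rng.normal(size=8)
X_unl, W = rng.normal(size=(20, 3)), rng.normal(size=(3, 4))
print(ssdkl_objective(W, X_lab, y, X_unl))
```

In the full method both the kernel hyperparameters and the network weights `W` would be trained jointly by gradient descent on this combined objective; here only the objective itself is shown.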

## 56 Citations

### Metric-Based Semi-Supervised Regression

- Computer Science, IEEE Access
- 2020

The experimental results indicate that the proposed method for semi-supervised regression achieves promising results, capturing the trend of a non-linear function and predicting well even when the dataset contains extreme outliers.

### Deep Low-Density Separation for Semi-supervised Classification

- Computer Science, ICCS
- 2020

A novel hybrid method is introduced that applies low-density separation to neural network-based embeddings and effectively classifies thousands of unlabeled users from a relatively small number of hand-classified examples.

### Deep kernels with probabilistic embeddings for small-data learning

- Computer Science, UAI
- 2021

The proposed approach maps high-dimensional data to a probability distribution in a low dimensional subspace and then computes a kernel between these distributions to capture similarity and derive a functional gradient descent procedure for training the model.

### FisherMatch: Semi-Supervised Rotation Regression via Entropy-based Filtering

- Computer Science, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2022

This work proposes to leverage matrix Fisher distribution to build a probabilistic model of rotation and devise a matrix Fisher-based regressor for jointly predicting rotation along with its prediction uncertainty, and proposes to use the entropy of the predicted distribution as a confidence measure, which enables us to perform pseudo label filtering for rotation regression.

### Lautum Regularization for Semi-Supervised Transfer Learning

- Computer Science, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)
- 2019

The theory suggests that one may improve the transferability of a deep neural network by imposing a Lautum information based regularization that relates the network weights to the target data.

### Few-shot learning for spatial regression via neural embedding-based Gaussian processes

- Computer Science, Machine Learning
- 2021

The proposed few-shot learning method for spatial regression achieves better predictive performance than existing meta-learning methods using spatial datasets and can be trained efficiently and effectively so that the test predictive performance improves when adapted to newly given small data.

### Deep Probabilistic Kernels for Sample-Efficient Learning

- Computer Science, ArXiv
- 2019

Deep Probabilistic kernels are proposed which use a probabilistic neural network to map high-dimensional data to a probability distribution in a low dimensional subspace, and leverage the rich work on kernels between distributions to capture the similarity between these distributions.

### A Simple yet Effective Baseline for Robust Deep Learning with Noisy Labels

- Computer Science, ArXiv
- 2019

This work proposes a simple but effective baseline that is robust to noisy labels, even with severe noise, and involves a variance regularization term that implicitly penalizes the Jacobian norm of the neural network on the whole training set (including the noisy-labeled data), which encourages generalization and prevents overfitting to the corrupted labels.

### NP-Match: When Neural Processes meet Semi-Supervised Learning

- Computer Science, ICML
- 2022

NPs are adjusted to the semi-supervised image classification task, resulting in a new method named NP-Match, which outperforms or achieves competitive results against state-of-the-art (SOTA) methods, demonstrating the effectiveness of NP-Match and its potential for SSL.

### USB: A Unified Semi-supervised Learning Benchmark for Classification

- Computer Science
- 2022

A Unified SSL Benchmark (USB) is constructed by selecting 15 diverse, challenging, and comprehensive tasks from CV, natural language processing (NLP), and audio processing (Audio), on which to systematically evaluate the dominant SSL methods, and open-source a modular and extensible codebase for fair evaluation.

## References

Showing 1-10 of 46 references

### Semi-supervised Learning by Entropy Minimization

- Computer Science, CAP
- 2004

This framework, which motivates minimum entropy regularization, makes it possible to incorporate unlabeled data into standard supervised learning and includes other approaches to the semi-supervised problem as particular or limiting cases.

### Semi-Supervised Regression with Co-Training

- Computer Science, IJCAI
- 2005

COREG, a co-training style semi-supervised regression algorithm, is proposed; experiments show that it can effectively exploit unlabeled data to improve regression estimates.

### Stochastic Variational Deep Kernel Learning

- Computer Science, NIPS
- 2016

An efficient form of stochastic variational inference is derived which leverages local kernel interpolation, inducing points, and structure-exploiting algebra within this framework to enable classification, multi-task learning, additive covariance structures, and stochastic gradient training.

### Realistic Evaluation of Semi-Supervised Learning Algorithms

- Computer Science, ICLR
- 2018

This work creates a unified reimplementation and evaluation platform for various widely used SSL techniques and finds that the performance of simple baselines which do not use unlabeled data is often underreported, that SSL methods differ in sensitivity to the amount of labeled and unlabeled data, and that performance can degrade substantially when the unlabeled dataset contains out-of-class examples.

### Temporal Ensembling for Semi-Supervised Learning

- Computer Science, ICLR
- 2017

Self-ensembling is introduced, where it is shown that the ensemble prediction can be expected to be a better predictor for the unknown labels than the output of the network at the most recent training epoch, and can thus be used as a target for training.

### Deep Hybrid Models: Bridging Discriminative and Generative Approaches

- Computer Science
- 2017

A new framework to combine a broad class of discriminative and generative models, interpolating between the two extremes with a multiconditional likelihood objective is proposed, which gives rise to deep hybrid models.

### Minimum variance semi-supervised boosting for multi-label classification

- Computer Science, 2015 IEEE Global Conference on Signal and Information Processing (GlobalSIP)
- 2015

The experiments show that the proposed algorithm outperforms its supervised counterpart as well as existing information-theoretic semi-supervised methods, and that its performance steadily improves as more unlabeled data becomes available.

### Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results

- Computer Science, NIPS
- 2017

The recently proposed Temporal Ensembling has achieved state-of-the-art results in several semi-supervised learning benchmarks, but it becomes unwieldy when learning from large datasets, so Mean Teacher, a method that averages model weights instead of label predictions, is proposed.

### Auxiliary Deep Generative Models

- Computer Science, ICML
- 2016

This work extends deep generative models with auxiliary variables that improve the variational approximation, and proposes a model with two stochastic layers and skip connections which shows state-of-the-art performance in semi-supervised learning on the MNIST, SVHN, and NORB datasets.

### Deep Kernel Learning

- Computer Science, AISTATS
- 2016

We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods. Specifically, we transform the inputs…