Corpus ID: 44061130

Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance

Neal Jean, Sang Michael Xie, Stefano Ermon

Large amounts of labeled data are typically required to train deep learning models. For many real-world problems, however, acquiring additional data can be expensive or even impossible. We present semi-supervised deep kernel learning (SSDKL), a semi-supervised regression model based on minimizing predictive variance in the posterior regularization framework. SSDKL combines the hierarchical representation learning of neural networks with the probabilistic modeling capabilities of Gaussian… 
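
The idea stated in the abstract, fitting a Gaussian process on learned features while penalizing predictive variance on unlabeled inputs, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The one-layer `embed` feature map, the RBF kernel, the `noise` level, the weighting `alpha`, and the normalization of each term are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)


def embed(x, weights):
    """Hypothetical one-layer feature map standing in for the neural network."""
    return np.tanh(x @ weights)


def semi_supervised_objective(x_lab, y_lab, x_unl, weights, alpha=1.0, noise=0.1):
    """Return (objective, predictive variances at the unlabeled points).

    objective = GP negative log marginal likelihood on labeled data / n
                + alpha * mean GP predictive variance on unlabeled data
    (an assumed form, chosen to mirror the description in the abstract).
    """
    z_lab = embed(x_lab, weights)
    z_unl = embed(x_unl, weights)
    n = len(z_lab)

    # Labeled-labeled kernel with observation noise, and unlabeled-labeled kernel.
    k_ll = rbf_kernel(z_lab, z_lab) + noise**2 * np.eye(n)
    k_ul = rbf_kernel(z_unl, z_lab)

    # Cholesky factor gives both K^{-1} y and log det K stably.
    chol = np.linalg.cholesky(k_ll)
    k_inv_y = np.linalg.solve(chol.T, np.linalg.solve(chol, y_lab))
    nll = (
        0.5 * y_lab @ k_inv_y
        + np.sum(np.log(np.diag(chol)))
        + 0.5 * n * np.log(2.0 * np.pi)
    )

    # Latent predictive variance: k(z, z) - k_ul K^{-1} k_lu, with k(z, z) = 1 for RBF.
    v = np.linalg.solve(chol, k_ul.T)
    pred_var = 1.0 - np.sum(v**2, axis=0)

    return nll / n + alpha * np.mean(pred_var), pred_var
```

In a training loop, `weights` and the kernel hyperparameters would be updated by gradient descent on this objective using an automatic differentiation library. The sketch only evaluates the objective so that the two terms are easy to inspect.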

Metric-Based Semi-Supervised Regression

Experimental results indicate that the proposed semi-supervised regression method achieves promising results: it captures the trend of a non-linear function and generally predicts well even when the dataset contains extreme outliers.

Deep Low-Density Separation for Semi-supervised Classification

A novel hybrid method that applies low-density separation to the embedded features of neural network-based embeddings and effectively classifies thousands of unlabeled users from a relatively small number of hand-classified examples is introduced.

Deep kernels with probabilistic embeddings for small-data learning

The proposed approach maps high-dimensional data to a probability distribution in a low-dimensional subspace, computes a kernel between these distributions to capture similarity, and derives a functional gradient descent procedure for training the model.

FisherMatch: Semi-Supervised Rotation Regression via Entropy-based Filtering

This work proposes to leverage matrix Fisher distribution to build a probabilistic model of rotation and devise a matrix Fisher-based regressor for jointly predicting rotation along with its prediction uncertainty, and proposes to use the entropy of the predicted distribution as a confidence measure, which enables us to perform pseudo label filtering for rotation regression.

Lautum Regularization for Semi-Supervised Transfer Learning

The theory suggests that one may improve the transferability of a deep neural network by imposing a Lautum information based regularization that relates the network weights to the target data.

Few-shot learning for spatial regression via neural embedding-based Gaussian processes

The proposed few-shot learning method for spatial regression achieves better predictive performance than existing meta-learning methods using spatial datasets and can be trained efficiently and effectively so that the test predictive performance improves when adapted to newly given small data.

Deep Probabilistic Kernels for Sample-Efficient Learning

Deep Probabilistic kernels are proposed which use a probabilistic neural network to map high-dimensional data to a probability distribution in a low dimensional subspace, and leverage the rich work on kernels between distributions to capture the similarity between these distributions.

A Simple yet Effective Baseline for Robust Deep Learning with Noisy Labels

This work proposes a simple but effective baseline that is robust to noisy labels, even with severe noise, and involves a variance regularization term that implicitly penalizes the Jacobian norm of the neural network on the whole training set (including the noisy-labeled data), which encourages generalization and prevents overfitting to the corrupted labels.

NP-Match: When Neural Processes meet Semi-Supervised Learning

NPs are adjusted to the semi-supervised image classification task, resulting in a new method named NP-Match, which outperforms or is competitive with state-of-the-art (SOTA) results, showing the effectiveness of NP-Match and its potential for SSL.

USB: A Unified Semi-supervised Learning Benchmark for Classification

A Unified SSL Benchmark (USB) is constructed by selecting 15 diverse, challenging, and comprehensive tasks from CV, natural language processing (NLP), and audio processing (Audio), on which the dominant SSL methods are systematically evaluated, and a modular and extensible codebase is open-sourced for fair evaluation.



Semi-supervised Learning by Entropy Minimization

This framework, which motivates minimum entropy regularization, makes it possible to incorporate unlabeled data into standard supervised learning, and includes other approaches to the semi-supervised problem as particular or limiting cases.

Semi-Supervised Regression with Co-Training

COREG, a co-training style semi-supervised regression algorithm, is proposed, and experiments show that it can effectively exploit unlabeled data to improve regression estimates.

Stochastic Variational Deep Kernel Learning

An efficient form of stochastic variational inference is derived which leverages local kernel interpolation, inducing points, and structure-exploiting algebra within this framework to enable classification, multi-task learning, additive covariance structures, and stochastic gradient training.

Realistic Evaluation of Semi-Supervised Learning Algorithms

This work creates a unified reimplementation and evaluation platform of various widely used SSL techniques and finds that the performance of simple baselines which do not use unlabeled data is often underreported, that SSL methods differ in sensitivity to the amount of labeled and unlabeled data, and that performance can degrade substantially when the unlabeled dataset contains out-of-class examples.

Temporal Ensembling for Semi-Supervised Learning

Self-ensembling is introduced, where it is shown that this ensemble prediction can be expected to be a better predictor for the unknown labels than the output of the network at the most recent training epoch, and can thus be used as a target for training.

Deep Hybrid Models: Bridging Discriminative and Generative Approaches

A new framework to combine a broad class of discriminative and generative models, interpolating between the two extremes with a multiconditional likelihood objective is proposed, which gives rise to deep hybrid models.

Minimum variance semi-supervised boosting for multi-label classification

The experiments show that the proposed algorithm outperforms its supervised counterpart as well as the existing information theoretic based semi-supervised methods, and its performance is steadily improving as more unlabeled data is available.

Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results

The recently proposed Temporal Ensembling has achieved state-of-the-art results in several semi-supervised learning benchmarks, but it becomes unwieldy when learning large datasets, so Mean Teacher, a method that averages model weights instead of label predictions, is proposed.

Auxiliary Deep Generative Models

This work extends deep generative models with auxiliary variables which improves the variational approximation and proposes a model with two stochastic layers and skip connections which shows state-of-the-art performance within semi-supervised learning on MNIST, SVHN and NORB datasets.

Deep Kernel Learning

We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods. Specifically, we transform the inputs