Corpus ID: 43744008

Doubly Stochastic Variational Inference for Deep Gaussian Processes

@inproceedings{Salimbeni2017DoublySV,
  title={Doubly Stochastic Variational Inference for Deep Gaussian Processes},
  author={Hugh Salimbeni and Marc Peter Deisenroth},
  booktitle={NIPS},
  year={2017}
}
Gaussian processes (GPs) are a good choice for function approximation as they are flexible, robust to over-fitting, and provide well-calibrated predictive uncertainty. Deep Gaussian processes (DGPs) are multi-layer generalisations of GPs, but inference in these models has proved challenging. Existing approaches to inference in DGP models assume approximate posteriors that force independence between the layers, and do not work well in practice. We present a doubly stochastic variational… 
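
The "doubly stochastic" label refers to two sources of stochasticity in the bound: minibatch subsampling of the data and Monte Carlo sampling of each layer's outputs from its sparse-GP marginals via the reparameterisation trick. The NumPy sketch below is a minimal illustration under assumed choices (RBF kernel, Gaussian likelihood, a q(u) covariance shared across output dimensions); the names rbf, sample_layer, loglik_estimate, Z, q_mu and q_sqrt are illustrative, and this is not the authors' reference implementation.

import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel matrix between the rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def sample_layer(F_in, Z, q_mu, q_sqrt, rng):
    # One reparameterised sample from the sparse-GP marginal at each row of F_in:
    # q(f) = N(A^T m, k_nn + A^T (S - Kmm) A), with A = Kmm^{-1} Kmn and S = q_sqrt q_sqrt^T.
    M = Z.shape[0]
    Kmm = rbf(Z, Z) + 1e-6 * np.eye(M)
    Kmn = rbf(Z, F_in)
    A = np.linalg.solve(Kmm, Kmn)                       # [M, N]
    mean = A.T @ q_mu                                   # [N, D_out]
    S = q_sqrt @ q_sqrt.T
    knn_diag = np.ones(F_in.shape[0])                   # RBF prior variance (=1) on the diagonal
    var = knn_diag + np.einsum('mn,mk,kn->n', A, S - Kmm, A)
    var = np.clip(var, 1e-8, None)[:, None]
    return mean + np.sqrt(var) * rng.standard_normal(mean.shape)

def loglik_estimate(X, Y, batch_idx, layers, noise_var, rng, num_samples=5):
    # Unbiased minibatch Monte Carlo estimate of the expected log-likelihood term of the
    # ELBO; the analytic KL terms for each layer's q(u) would be subtracted separately.
    N, B = X.shape[0], len(batch_idx)
    Xb, Yb = X[batch_idx], Y[batch_idx]
    total = 0.0
    for _ in range(num_samples):
        F = Xb
        for Z, q_mu, q_sqrt in layers:                  # propagate samples layer by layer
            F = sample_layer(F, Z, q_mu, q_sqrt, rng)
        total += np.sum(-0.5 * np.log(2 * np.pi * noise_var)
                        - 0.5 * (Yb - F) ** 2 / noise_var)
    return (N / B) * total / num_samples                # rescale the minibatch to the full dataset

# Toy usage: a two-layer DGP on 1-D data with 10 inducing points per layer.
rng = np.random.default_rng(0)
N, D, M = 100, 1, 10
X = rng.uniform(-3.0, 3.0, size=(N, D))
Y = np.sin(X) + 0.1 * rng.standard_normal((N, D))
make_layer = lambda d_out: (rng.uniform(-3.0, 3.0, (M, D)),  # inducing inputs Z
                            np.zeros((M, d_out)),            # variational mean q_mu
                            0.1 * np.eye(M))                 # variational sqrt-covariance q_sqrt
layers = [make_layer(D), make_layer(1)]
batch_idx = rng.choice(N, size=20, replace=False)
print(loglik_estimate(X, Y, batch_idx, layers, noise_var=0.01, rng=rng))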

Citations

Compositional uncertainty in deep Gaussian processes
TLDR
It is argued that such an inference scheme is suboptimal, failing to exploit the model's potential to discover compositional structure in the data, and alternative variational inference schemes that allow dependencies across layers are examined.
Inference in Deep Gaussian Processes using Stochastic Gradient Hamiltonian Monte Carlo
TLDR
This work provides evidence for the non-Gaussian nature of the posterior and applies Stochastic Gradient Hamiltonian Monte Carlo to generate samples, which yields significantly better predictions at a lower computational cost than the variational-inference counterpart (a minimal SGHMC update sketch follows this citations list).
Structured Variational Inference for Coupled Gaussian Processes
TLDR
Previous sparse GP approximations are extended, and a novel parameterization of variational posteriors in the multi-GP setting is proposed, allowing fast and scalable inference that captures posterior dependencies.
Deep Gaussian processes using expectation propagation and Monte Carlo methods
TLDR
This work presents a new method for inference in DGPs that combines Monte Carlo methods with the expectation propagation algorithm; it can capture input-dependent output noise and generate multimodal predictive distributions.
Deep Variational Implicit Processes
TLDR
A scalable variational inference algorithm for training DVIP is described and shown to outperform previous IP-based methods as well as deep GPs; it is evaluated on large datasets with up to several million data instances to illustrate its scalability and performance.
Deep Gaussian Processes with Decoupled Inducing Inputs
TLDR
This work shows that the computational cost of deep Gaussian processes can be reduced with no loss in performance by using a separate, smaller set of pseudo points when calculating the layer-wise variance and a larger set of pseudo points when calculating the layer-wise mean.
Sparse Gaussian Processes Revisited: Bayesian Approaches to Inducing-Variable Approximations
TLDR
This work shows that, by revisiting older model approximations such as the fully independent training conditional and endowing them with powerful sampling-based inference methods, treating both inducing locations and GP hyper-parameters in a Bayesian way can improve performance significantly.
Scalable Training of Inference Networks for Gaussian-Process Models
TLDR
This work proposes a scalable and easy-to-implement method that enables minibatch training by tracking a stochastic, functional mirror-descent algorithm which only requires considering a finite number of input locations.
Generic Inference in Latent Gaussian Process Models
TLDR
An automated variational method for inference in models with Gaussian process (GP) priors and general likelihoods; it scales to large datasets by using an augmented prior via the inducing-variable approach underpinning most sparse GP approximations, along with parallel computation and stochastic optimization.
Rethinking Sparse Gaussian Processes: Bayesian Approaches to Inducing-Variable Approximations
TLDR
This work develops a fully Bayesian approach to scalable GP and deep GP models, and shows that treating both inducing locations and GP hyper-parameters in a Bayesian way, by inferring their full posterior, further significantly improves performance.
…
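
For the SGHMC-based work listed above ("Inference in Deep Gaussian Processes using Stochastic Gradient Hamiltonian Monte Carlo"), the sketch below shows the standard SGHMC update in its simplified practical form (a friction term, with the gradient-noise estimate set to zero), applied to a toy one-dimensional target; the step sizes and the toy target are illustrative assumptions, not that paper's implementation.

import numpy as np

def sghmc_step(theta, v, stoch_grad_U, lr, friction, rng):
    # One SGHMC update on the negative log posterior U, using a noisy gradient;
    # simplified practical form with the gradient-noise covariance estimate set to zero.
    noise = rng.standard_normal(theta.shape) * np.sqrt(2.0 * friction * lr)
    v = v - lr * stoch_grad_U(theta) - friction * v + noise
    return theta + v, v

# Toy usage: sample from N(0, 1), i.e. U(theta) = 0.5 * theta**2, with a noisy gradient.
rng = np.random.default_rng(0)
stoch_grad_U = lambda th: th + 0.1 * rng.standard_normal(th.shape)
theta, v = np.zeros(1), np.zeros(1)
samples = []
for t in range(5000):
    theta, v = sghmc_step(theta, v, stoch_grad_U, lr=1e-2, friction=0.1, rng=rng)
    if t > 1000:                     # discard burn-in
        samples.append(theta[0])
print(np.std(samples))               # roughly 1 for a reasonably tuned chain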

References

Showing 1-10 of 44 references
Training Deep Gaussian Processes with Sampling
TLDR
This workshop paper proposes a stochastic gradient algorithm which relies on sampling to circumvent the intractability hurdle and uses pseudo data to ease the computational burden.
Random Feature Expansions for Deep Gaussian Processes
TLDR
A novel formulation of DGPs based on random feature expansions, trained using stochastic variational inference; it yields a practical learning framework that significantly advances the state of the art in inference for DGPs and enables accurate quantification of uncertainty (a small random-feature sketch follows this reference list).
Generic Inference in Latent Gaussian Process Models
TLDR
An automated variational method for inference in models with Gaussian process (GP) priors and general likelihoods; it scales to large datasets by using an augmented prior via the inducing-variable approach underpinning most sparse GP approximations, along with parallel computation and stochastic optimization.
Deep Gaussian Processes for Regression using Approximate Expectation Propagation
TLDR
A new approximate Bayesian learning scheme is developed that enables DGPs to be applied to a range of medium- to large-scale regression problems for the first time, and is almost always better than state-of-the-art deterministic and sampling-based approximate inference methods for Bayesian neural networks.
Sequential Inference for Deep Gaussian Process
TLDR
This paper proposes an efficient sequential inference framework for DGPs, in which the data are processed sequentially, together with two DGP extensions that handle heteroscedasticity and multi-task learning.
Nested Variational Compression in Deep Gaussian Processes
TLDR
This paper extends variational compression to allow approximate variational marginalization of the hidden variables, leading to a lower bound on the model's marginal likelihood that can be easily parallelized or adapted for stochastic variational inference.
Gaussian Processes for Big Data
TLDR
Stochastic variational inference for Gaussian process models is introduced, and it is shown how GPs can be variationally decomposed to depend on a set of globally relevant inducing variables that factorize the model in the manner necessary for variational inference.
Variational Auto-encoded Deep Gaussian Processes
TLDR
A new formulation of the variational lower bound is derived that allows most of the computation to be distributed, making it possible to handle datasets of the size found in mainstream deep learning tasks.
Two problems with variational expectation maximisation for time-series models
Variational methods are a key component of the approximate inference and learning toolbox. These methods fill an important middle ground, retaining distributional information about uncertainty in…
On Sparse Variational Methods and the Kullback-Leibler Divergence between Stochastic Processes
TLDR
A substantial generalization of the variational framework for learning inducing variables is given, along with a new proof of the result for infinite index sets that allows inducing points which are not data points and likelihoods that depend on all function values.
…
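
As a rough illustration of the construction behind "Random Feature Expansions for Deep Gaussian Processes" above, the sketch below approximates an RBF kernel with random Fourier features; stacking Bayesian linear layers on such features gives the flavour of that approach, not its exact formulation. The name rff_features and the chosen sizes are illustrative assumptions.

import numpy as np

def rff_features(X, num_features, lengthscale=1.0, variance=1.0, rng=None):
    # Random Fourier features whose inner products approximate an RBF kernel
    # with the given lengthscale and variance.
    rng = rng or np.random.default_rng(0)
    D = X.shape[1]
    W = rng.standard_normal((D, num_features)) / lengthscale  # spectral frequencies
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)      # random phases
    return np.sqrt(2.0 * variance / num_features) * np.cos(X @ W + b)

# Check the approximation: phi(x) @ phi(x') ~= k(x, x').
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 2))
Phi = rff_features(X, num_features=2048, rng=rng)
K_approx = Phi @ Phi.T
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-0.5 * sq_dists)                              # lengthscale = variance = 1
print(np.max(np.abs(K_approx - K_exact)))                      # shrinks as num_features grows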