Exact Gaussian Processes for Massive Datasets via Non-Stationary Sparsity-Discovering Kernels

@article{Noack2022ExactGP,
  title={Exact Gaussian Processes for Massive Datasets via Non-Stationary Sparsity-Discovering Kernels},
  author={Marcus Michael Noack and Harinarayan Krishnan and Mark D. Risser and Kristofer G. Reyes},
  journal={ArXiv},
  year={2022},
  volume={abs/2205.09070}
}
A Gaussian Process (GP) is a prominent mathematical framework for stochastic function approximation in science and engineering applications. This success is largely attributed to the GP's analytical tractability, robustness, non-parametric structure, and natural inclusion of uncertainty quantification. Unfortunately, the use of exact GPs is prohibitively expensive for large datasets due to their unfavorable numerical complexity of O(N³) in computation and O(N²) in storage. All existing…
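To make the complexity concrete, here is a minimal exact-GP regression sketch in plain NumPy/SciPy, using a generic RBF kernel rather than the paper's sparsity-discovering kernel: the Cholesky factorization of the N × N covariance matrix is the O(N³) computation, and the matrix itself is the O(N²) storage.

```python
# Minimal exact GP regression sketch with a generic RBF kernel -- not
# the paper's sparsity-discovering kernel. The Cholesky factorization
# of the N x N covariance matrix is the O(N^3) step; storing K is the
# O(N^2) memory cost the abstract refers to.
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def rbf(x1, x2, length_scale=0.2, variance=1.0):
    """Squared-exponential covariance between two 1-D point sets."""
    return variance * np.exp(-0.5 * (x1[:, None] - x2[None, :])**2 / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))  # O(N^2) storage
    L = cho_factor(K, lower=True)                             # O(N^3) compute
    alpha = cho_solve(L, y_train)
    K_star = rbf(x_test, x_train)
    mean = K_star @ alpha
    cov = rbf(x_test, x_test) - K_star @ cho_solve(L, K_star.T)
    return mean, cov

x = np.linspace(0.0, 1.0, 500)
y = np.sin(2 * np.pi * x) + 0.1 * np.random.randn(500)
mu, cov = gp_posterior(x, y, np.linspace(0.0, 1.0, 50))
```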


References

Showing 10 of 21 references.
A Sparse Covariance Function for Exact Gaussian Process Inference in Large Datasets
A new stationary covariance function (Mercer kernel) is constructed that naturally provides a sparse covariance matrix, enabling exact GP inference; it performs comparably to the squared-exponential kernel at a lower computational cost.
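As an illustration of the sparse-covariance idea (using a generic compactly supported Wendland kernel, not necessarily the covariance function constructed in that paper): any kernel that is exactly zero beyond a cutoff radius yields a sparse covariance matrix, so exact GP algebra can run through sparse solvers.

```python
# Illustration only: a compactly supported (here Wendland C2) kernel,
# not necessarily the covariance function from the cited paper. The
# kernel is exactly zero beyond `support`, so K is sparse and exact GP
# solves can use sparse linear algebra. (Building the dense distance
# matrix first is itself O(N^2); a real implementation would assemble
# K directly in sparse form via a neighbor search.)
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def wendland_c2(r, support=0.05):
    s = np.clip(r / support, 0.0, 1.0)
    return (1.0 - s)**4 * (4.0 * s + 1.0)   # exactly zero for r >= support

n = 2000
x = np.random.rand(n)
K = sparse.csc_matrix(wendland_c2(np.abs(x[:, None] - x[None, :])))
y = np.sin(6 * x) + 0.05 * np.random.randn(n)
alpha = spsolve(K + 1e-2 * sparse.eye(n, format="csc"), y)
print(f"nonzero fraction of K: {K.nnz / n**2:.3f}")
```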
Exact Gaussian Processes on a Million Data Points
A scalable approach for exact GPs is developed that leverages multi-GPU parallelization and methods like linear conjugate gradients, accessing the kernel matrix only through matrix multiplication; the approach is generally applicable, without constraints to grid data or specific kernel classes.
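A toy, single-CPU sketch of the matrix-free idea behind that approach: conjugate gradients only requires products K @ v, so the kernel matrix can be generated block-by-block and never stored; the cited work distributes these products across GPUs.

```python
# Toy single-CPU sketch of the matrix-free approach: conjugate
# gradients touches the kernel matrix only through products K @ v,
# computed here block-by-block so the full N x N matrix is never
# stored. The cited work parallelizes these products across GPUs.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

n, noise, block = 5000, 1e-2, 500
x = np.random.rand(n)
y = np.sin(4 * x) + 0.1 * np.random.randn(n)

def kernel_matvec(v):
    """Compute (K + noise*I) @ v without materializing K."""
    out = noise * v
    for i in range(0, n, block):
        K_block = np.exp(-0.5 * (x[i:i + block, None] - x[None, :])**2 / 0.1**2)
        out[i:i + block] += K_block @ v
    return out

A = LinearOperator((n, n), matvec=kernel_matvec)
alpha, info = cg(A, y)          # alpha = (K + noise*I)^{-1} y
```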
A General Framework for Vecchia Approximations of Gaussian Processes
It is shown that the general Vecchia approach contains many popular existing GP approximations as special cases, allowing for comparisons among the different methods within a unified framework.
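A minimal sketch of the Vecchia construction (with a simplistic left-to-right ordering and "previous m points" as the conditioning sets, standing in for proper nearest-neighbor sets): the joint density is approximated by a product of univariate conditionals, each requiring only an m × m solve.

```python
# Minimal Vecchia sketch: approximate the joint Gaussian density as a
# product of univariate conditionals, each conditioning only on a small
# set of previously ordered points (here simply the previous m points
# after sorting, standing in for proper nearest-neighbor sets).
import numpy as np

def rbf(a, b, ls=0.1):
    return np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ls**2)

def vecchia_loglik(x, y, m=10, noise=1e-2):
    order = np.argsort(x)                      # simple coordinate ordering
    x, y = x[order], y[order]
    ll = 0.0
    for i in range(len(x)):
        nbrs = np.arange(max(0, i - m), i)     # conditioning set, size <= m
        if len(nbrs) == 0:
            mu, var = 0.0, 1.0 + noise
        else:
            K_nn = rbf(x[nbrs], x[nbrs]) + noise * np.eye(len(nbrs))
            k_in = rbf(x[i:i + 1], x[nbrs]).ravel()
            w = np.linalg.solve(K_nn, k_in)    # only an m x m solve
            mu = w @ y[nbrs]
            var = 1.0 + noise - w @ k_in
        ll += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mu)**2 / var)
    return ll

x = np.random.rand(1000)
y = np.sin(5 * x) + 0.1 * np.random.randn(1000)
print(vecchia_loglik(x, y))
```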
Kernel Interpolation for Scalable Structured Gaussian Processes (KISS-GP)
A new structured kernel interpolation (SKI) framework is introduced, which generalises and unifies inducing point methods for scalable Gaussian processes (GPs) and naturally enables Kronecker and Toeplitz algebra for substantial additional gains in scalability.
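A toy one-dimensional version of the SKI approximation (linear interpolation onto a regular inducing grid; the grid structure is what enables the Toeplitz/Kronecker algebra in the full method, which this sketch omits):

```python
# Toy 1-D version of structured kernel interpolation: K is approximated
# by W @ K_uu @ W.T, where K_uu is the kernel on a regular inducing grid
# and W holds sparse linear-interpolation weights. This sketch only
# shows the interpolation step, not the structured-algebra speedups.
import numpy as np
from scipy import sparse

n, g = 1000, 50
x = np.sort(np.random.rand(n))
grid = np.linspace(0.0, 1.0, g)
h = grid[1] - grid[0]

# Each point interpolates linearly between its two neighboring grid cells.
idx = np.clip(np.searchsorted(grid, x) - 1, 0, g - 2)
t = (x - grid[idx]) / h
rows = np.repeat(np.arange(n), 2)
cols = np.stack([idx, idx + 1], axis=1).ravel()
vals = np.stack([1.0 - t, t], axis=1).ravel()
W = sparse.csr_matrix((vals, (rows, cols)), shape=(n, g))

K_uu = np.exp(-0.5 * (grid[:, None] - grid[None, :])**2 / 0.1**2)
K_ski = W @ (W @ K_uu).T   # = W K_uu W.T since K_uu is symmetric
```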
Generalized Local Aggregation for Large Scale Gaussian Process Regression
This work generalizes traditional mutual-information-based aggregation methods (GPoE, RBCM, GRBCM) using Tsallis mutual information and proposes three heuristic algorithms for large-scale Gaussian process regression.
Fixed rank kriging for very large spatial data sets
Spatial statistics for very large spatial data sets is challenging. The size of the data set, n, causes problems in computing optimal spatial predictors such as kriging, since its computational cost is of order n³.
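A sketch of the fixed-rank idea (with generic radial basis functions standing in for the paper's basis choice): with an n × r basis matrix and r ≪ n, the Sherman-Morrison-Woodbury identity reduces the kriging solve from O(n³) to O(nr²).

```python
# Sketch of the fixed rank idea with generic radial basis functions
# standing in for the paper's basis choice: covariance modeled as
# C = S @ K @ S.T + noise*I with r << n, so the kriging solve runs in
# O(n r^2) via the Sherman-Morrison-Woodbury identity instead of O(n^3).
import numpy as np

n, r, noise = 5000, 40, 1e-2
x = np.random.rand(n)
centers = np.linspace(0.0, 1.0, r)
S = np.exp(-0.5 * (x[:, None] - centers[None, :])**2 / 0.05**2)  # n x r basis
K = np.eye(r)                           # r x r covariance of basis weights
y = np.sin(4 * x) + 0.1 * np.random.randn(n)

# Woodbury: (noise*I + S K S')^{-1} y = (y - S (noise*K^{-1} + S'S)^{-1} S'y) / noise
inner = noise * np.linalg.inv(K) + S.T @ S          # only r x r
alpha = (y - S @ np.linalg.solve(inner, S.T @ y)) / noise
```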
Gaussian predictive process models for large spatial data sets
This work achieves the flexibility to accommodate non-stationary, non-Gaussian, possibly multivariate, and possibly spatiotemporal processes for large data sets, in the form of a computational template encompassing these diverse settings.
Healing Products of Gaussian Process Experts
This work leverages the optimal transport literature and proposes a new product-of-experts model that combines the predictions of local experts by computing their Wasserstein barycenter; it can be applied to both regression and classification.
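For intuition, in one dimension the Wasserstein-2 barycenter of Gaussian expert predictions has a closed form, which is what makes this way of combining experts cheap; a minimal sketch of that generic textbook fact, not the cited paper's full model:

```python
# Generic closed form, not the cited paper's full model: in one
# dimension, the Wasserstein-2 barycenter of Gaussians N(mu_i, sigma_i^2)
# with weights w_i is N(sum w_i mu_i, (sum w_i sigma_i)^2).
import numpy as np

def gaussian_w2_barycenter(mus, sigmas, weights):
    w = np.asarray(weights, dtype=float)
    w /= w.sum()
    mean = np.sum(w * mus)
    std = np.sum(w * sigmas)   # 1-D only; covariance matrices need a fixed-point iteration
    return mean, std

# Three local experts predicting at the same test point:
print(gaussian_w2_barycenter([0.9, 1.1, 1.0], [0.2, 0.3, 0.25], [1.0, 1.0, 1.0]))
```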
Gaussian Processes for Machine Learning
The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and deals with the supervised learning problem for both regression and classification.
On the optimization of hyperparameters in Gaussian process regression
It is shown that choosing hyperparameters based on a criterion of the completeness of the basis in the corresponding linear regression problem is superior to MLE. This is facilitated by high-dimensional model representation (HDMR), whereby a low-order HDMR can provide reliable reference functions and the large synthetic test data sets needed for basis parameter optimization, even with few data.