Locally induced Gaussian processes for large-scale simulation experiments

Authors: D. Austin Cole, R. Christianson, Robert B. Gramacy
Journal: Statistics and Computing
Gaussian processes (GPs) serve as flexible surrogates for complex surfaces, but buckle under the cubic cost of matrix decompositions with big training data sizes. Geospatial and machine learning communities suggest pseudo-inputs, or inducing points, as one strategy to obtain an approximation easing that computational burden. However, we show how placement of inducing points and their multitude can be thwarted by pathologies, especially in large-scale dynamic response surface modeling tasks. As…
Batch-sequential design and heteroskedastic surrogate modeling for delta smelt conservation
A batch sequential design scheme is proposed, generalizing one-at-a-time variance-based active learning for HetGP surrogates, as a means of keeping multi-core cluster nodes fully engaged with expensive runs.
Large-scale local surrogate modeling of stochastic simulation experiments
Gaussian process (GP) regression in large-data contexts, which often arises in surrogate modeling of stochastic simulation experiments, is challenged by cubic runtimes. Coping with input-dependent…
Sensitivity Prewarping for Local Surrogate Modeling
A framework is proposed for incorporating information from a global sensitivity analysis into the surrogate model as an input rotation and rescaling preprocessing step. The framework performs an input warping such that the “warped simulator” is equally sensitive to all input directions, freeing local models to focus on local dynamics.
Active Learning for Deep Gaussian Process Surrogates.
This work transports a DGP's automatic warping of the input space and full uncertainty quantification, via a novel elliptical slice sampling (ESS) Bayesian posterior inferential scheme, through to active learning (AL) strategies that distribute runs non-uniformly in the input space, something an ordinary (stationary) GP could not do.


laGP: Large-Scale Spatial Modeling via Local Approximate Gaussian Processes in R
This work discusses an implementation of local approximate Gaussian process models, in the laGP package for R, that offers a particular sparse-matrix remedy uniquely positioned to leverage modern parallel computing architectures.
Speeding Up Neighborhood Search in Local Gaussian Process Prediction
This work studies how predictive variance is reduced as local designs are built up for prediction, and suggests that searching the space radially, that is, continuously along rays emanating from the predictive location of interest, is a far thriftier alternative.
Distance-Distributed Design for Gaussian Process Surrogates
This work studies the distribution of pairwise distances between design elements, develops a numerical scheme to optimize those distances for a given sample size and dimension, and proposes a family of new schemes by reverse engineering the qualities of the random designs which give the best estimates of GP length scales.
Gaussian predictive process models for large spatial data sets.
This work achieves the flexibility to accommodate non-stationary, non-Gaussian, possibly multivariate, possibly spatiotemporal processes in the context of large data sets in the form of a computational template encompassing these diverse settings.
Exact Gaussian Processes on a Million Data Points
A scalable approach for exact GPs is developed that leverages multi-GPU parallelization and methods like linear conjugate gradients, accessing the kernel matrix only through matrix multiplication, and is generally applicable, without constraints to grid data or specific kernel classes.
Emulating Satellite Drag from Large Simulation Experiments
This paper shows how extensions to the local approximate Gaussian Process (laGP) method allow accurate full-scale emulation, and demonstrates that the method achieves the desired level of accuracy, when trained on seventy thousand core hours of drag simulations for two real-world satellites.
Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets
A class of highly scalable nearest-neighbor Gaussian process (NNGP) models is developed to provide fully model-based inference for large geostatistical datasets, and it is established that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices.
Sparse Gaussian Processes using Pseudo-inputs
It is shown that this new Gaussian process (GP) regression model can match full GP performance with small M, i.e. very sparse solutions, and it significantly outperforms other approaches in this regime.
Local Gaussian Process Approximation for Large Computer Experiments
A family of local sequential design schemes that dynamically define the support of a Gaussian process predictor based on a local subset of the data is derived, enabling a global predictor able to take advantage of modern multicore architectures.
Mercer kernels and integrated variance experimental design: connections between Gaussian process regression and polynomial approximation
This paper introduces algorithms for minimizing a posterior integrated variance (IVAR) design criterion for GP regression, and shows how IVAR-optimal designs, while sacrificing discrete orthogonality of the kernel eigenfunctions, can yield lower approximation error than orthogonalizing point sets.