A memory-free spatial additive mixed modeling for big spatial data

  • Daisuke Murakami, Daniel A. Griffith
  • Published 26 July 2019
  • Mathematics, Computer Science
  • Japanese Journal of Statistics and Data Science
This study develops a spatial additive mixed modeling (AMM) approach estimating spatial and non-spatial effects from large samples, such as millions of observations. Although fast AMM approaches are already well established, they are restrictive in that they assume a known spatial dependence structure. To overcome this limitation, this study develops a fast AMM with the estimation of spatial structure in residuals and regression coefficients together with non-spatial effects. We rely on a Moran… 
Balancing Spatial and Non‐Spatial Variation in Varying Coefficient Modeling: A Remedy for Spurious Correlation
This study discusses the importance of balancing spatial and non-spatial variation in spatial regression modeling. Unlike spatially varying coefficients (SVC) modeling, which is popular in spatial…
Scalable GWR: A Linear-Time Algorithm for Large-Scale Geographically Weighted Regression with Polynomial Kernels
The key improvement is the calibration of the model through a precompression of the matrices and vectors whose size depends on the sample size, prior to the leave-one-out cross-validation, which is the heaviest computational step in conventional GWR.
spmoran: An R package for Moran's eigenvector-based spatial regression analysis
This study illustrates how to use "spmoran," an R package for Moran's eigenvector-based spatial regression analysis, by applying ESF and RE-ESF models to a land price analysis.
The GWR route map: a guide to the informed application of Geographically Weighted Regression
A route map is described to inform the choice of whether to use a GWR model or not, and if so which of three core variants to apply: a standard GWR, a mixed GWR or a multiscale GWR (MS-GWR).
Does financial deepening drive spatial heterogeneity of PM2.5 concentrations in China? New evidence from an eigenvector spatial filtering approach
To provide policymakers with a different perspective on reducing PM2.5 concentrations, this paper not only identifies the economic driving factors of PM2.5 concentrations in China but also…
Spatial heterogeneity and economic driving factors of SO2 emissions in China: Evidence from an eigenvector based spatial filtering approach
Sulfur dioxide (SO2) emissions have been a great challenge in China over the last few decades due to their serious impact on the environment and human health. In this paper, a random effect…
Investigating high-speed rail construction's support to county level regional development in China: An eigenvector based spatial filtering panel data analysis
The construction of high-speed rail in China was initially a direct response to the increasing demand of up-to-date infrastructure. It is commonly understood that the construction of HSR has…
Spatial regression modeling using the spmoran package: Boston housing price data examples
An approximate Gaussian process (GP or kriging model), which is interpretable in terms of the Moran coefficient (MC), is used for modeling the spatial process. The approximate GP is defined by a…
Compositionally-warped additive mixed modeling for a wide variety of non-Gaussian spatial data
A general framework for fast and flexible non-Gaussian regression, especially for spatial/spatiotemporal modeling, is developed; the resulting model, termed the compositionally-warped additive mixed model (CAMM), provides intuitively reasonable coefficient estimates and outperforms AMM in terms of prediction accuracy.
Scalable Model Selection for Spatial Additive Mixed Modeling: Application to Crime Analysis
A fast and practical model-selection approach for spatial regression models is developed, focusing on the selection of coefficient types, which include constant, spatially varying, and non-spatially varying coefficients; the approach is useful not only for selecting factors influencing crime risk but also for predicting crime events.


Spatially varying coefficient modeling for large datasets: Eliminating N from spatial regressions
While spatially varying coefficient (SVC) modeling is popular in applied science, its computational burden is substantial. This is especially true if a multiscale property of SVC is…
Limitations on low rank approximations for covariance matrices of spatial data
Evaluating the likelihood function for Gaussian models when a spatial process is observed irregularly is problematic for larger datasets due to constraints of memory and calculation. If the…
A Case Study Competition Among Methods for Analyzing Large Spatial Data
This study provides an introductory overview of several methods for analyzing large spatial data and describes the results of a predictive competition among the described methods as implemented by different groups with strong expertise in the methodology.
Gaussian predictive process models for large spatial data sets.
This work achieves the flexibility to accommodate non-stationary, non-Gaussian, possibly multivariate, possibly spatiotemporal processes in the context of large data sets in the form of a computational template encompassing these diverse settings.
Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets
A class of highly scalable nearest-neighbor Gaussian process (NNGP) models is developed to provide fully model-based inference for large geostatistical datasets, and it is established that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices.
Generalized Additive Models for Gigadata: Modeling the U.K. Black Smoke Network Daily Data
We develop scalable methods for fitting penalized regression spline based generalized additive models with of the order of 10^4 coefficients to up to 10^8 data. Computational feasibility rests…
Fixed rank kriging for very large spatial data sets
Spatial statistics for very large spatial data sets is challenging. The size of the data set, "n", causes problems in computing optimal spatial predictors such as kriging, since its computational…
A Multi-Resolution Approximation for Massive Spatial Datasets
A multi-resolution approximation (M-RA) of Gaussian processes observed at irregular locations in space is proposed, which can capture spatial structure from very fine to very large scales.
Dimension reduction and alleviation of confounding for spatial generalized linear mixed models
This work proposes a new parameterization of the spatial generalized linear mixed model that alleviates spatial confounding and speeds computation by greatly reducing the dimension of the spatial random effects.
Penalized basis models for very large spatial datasets
Under a Gaussianity assumption, this work proposes a graphical model family for the stochastic coefficients by parameterizing the precision matrix and develops a flexible nonstationary spatial model that is adaptable to very large datasets.