Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity

Florian Schäfer, Timothy John Sullivan, and Houman Owhadi. Multiscale Model. Simul.
Dense kernel matrices Θ ∈ R^{N×N} obtained from point evaluations of a covariance function G at locations {x_i}_{1≤i≤N} arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and approximately equally spaced sampling points, we show how to identify a subset S ⊂ {1,…,N}×{1,…,N}, with #S = O(N log(N) log^d(N/ϵ)), such that the zero fill-in block-incomplete Cholesky decomposition of Θ_{i,j} 1_{(i,j)∈S} is an…
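The idea of a zero fill-in incomplete Cholesky factorization restricted to a sparsity set S can be illustrated with a minimal sketch. The block below is not the paper's algorithm: it uses a plain (non-block) factorization, the natural ordering of the points, and a simple banded pattern, all of which are assumptions made for illustration. The function name `incomplete_cholesky`, the exponential kernel, and the point locations are likewise hypothetical choices.

```python
import math


def incomplete_cholesky(A, pattern):
    """Zero fill-in incomplete Cholesky factor of a symmetric matrix A.

    Only entries (i, j) with i > j that belong to `pattern` are computed;
    all other off-diagonal entries of L stay zero (no fill-in).
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Pivot: diagonal entry minus the squares already placed in row j.
        d = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError(f"breakdown at column {j}: pivot {d} is not positive")
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            if (i, j) in pattern:
                s = sum(L[i][k] * L[j][k] for k in range(j))
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


# Illustrative setup (assumed): six equally spaced points on a line and an
# exponential kernel; S keeps only nearest-neighbour interactions.
xs = [float(i) for i in range(6)]
theta = [[math.exp(-abs(x - y)) for y in xs] for x in xs]
S = {(i, j) for i in range(6) for j in range(6) if abs(i - j) <= 1}
L = incomplete_cholesky(theta, S)
```

When no pivot breaks down, the product L Lᵀ reproduces Θ exactly on the entries in S and differs from it only outside S. The paper's contribution, as summarized above, is a choice of S for which this discrepancy stays controlled; the banded pattern here makes no such claim.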
Sparse Cholesky factorization by Kullback-Leibler minimization
An elimination ordering and sparsity pattern are proposed that allow ϵ-approximate inverse Cholesky factors of such Θ to be computed in computational complexity O(N log(N/ϵ)^d) in space and O(N log(N/ϵ)^{2d}) in time, which is the best known asymptotic complexity for this class of problems.
Preconditioning for Scalable Gaussian Process Hyperparameter Optimization
This work proves general probabilistic error bounds for the preconditioned computation of the log-determinant, log-marginal likelihood and its derivatives, and derives specific rates for a range of kernel-preconditioner combinations, showing that up to exponential convergence can be achieved.
Posterior and Computational Uncertainty in Gaussian Processes
A new class of methods is developed that provides consistent estimation of the combined uncertainty arising from both the finite number of data observed and the finite amount of computation expended, and the consequences of ignoring computational uncertainty are demonstrated.
Samplets: A new paradigm for data compression
The concept of samplets is introduced by transferring the construction of Tausch-White wavelets to the realm of data to obtain a multilevel representation of discrete data which directly enables data compression, detection of singularities and adaptivity.
Learning Partial Differential Equations in Reproducing Kernel Hilbert Spaces
A new data-driven approach is proposed for learning the fundamental solutions of various linear partial differential equations (PDEs) from sample pairs of input-output functions. It estimates the best-fit Green's function and bias term in a reproducing kernel Hilbert space (RKHS), which makes it possible to regularize their smoothness and impose various structural constraints.
Certified and fast computations with shallow covariance kernels
This work introduces and analyzes a new, certified algorithm for the low-rank approximation of a parameterized family of covariance operators, which extends the adaptive cross approximation method for symmetric positive definite matrices.
Graph-Induced Rank Structures and their Representations
The notion of a graph-induced rank structure (GIRS) is introduced, which aims to capture the fine low-rank structures that appear in sparse matrices and their inverses in terms of the adjacency graph $\mathbb{G}$.
De-noising by thresholding operator adapted wavelets
Donoho and Johnstone [13] proposed a method for reconstructing an unknown smooth function u from noisy data u + ζ by translating the empirical wavelet coefficients of u + ζ towards zero. We consider
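Translating coefficients towards zero is commonly implemented as soft thresholding. The sketch below shows only that shrinkage step on a list of coefficients. The wavelet transform itself is omitted, and the function name `soft_threshold`, the sample coefficients, and the threshold value are assumptions chosen for illustration rather than anything taken from the cited work.

```python
import math


def soft_threshold(coeffs, t):
    """Shrink each coefficient towards zero by t; magnitudes below t become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


# Hypothetical empirical coefficients: two large ones carrying signal and
# two small ones dominated by noise.
noisy_coeffs = [3.0, -0.2, 0.1, -2.5]
denoised = soft_threshold(noisy_coeffs, t=0.5)
```

Small coefficients are set to zero while large ones are kept but reduced in magnitude by the threshold.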
Fast Eigenpairs Computation with Operator Adapted Wavelets and Hierarchical Subspace Correction
The proposed algorithm is robust to the presence of multiple (a continuum of) scales and is shown to be of near-linear complexity when $\mathcal{L}$ is an (arbitrary local, e.g. differential) operator mapping $\mathcal{H}^s_0(\Omega)$ to $\mathcal{H}^{-s}(\Omega)$ (e.g. an elliptic PDE with rough coefficients).


Compressing Rank-Structured Matrices via Randomized Sampling
The proposed scheme is also useful in simplifying the implementation of certain operations on rank-structured matrices such as the matrix-matrix multiplication, low-rank update, addition, etc.
Fast Direct Methods for Gaussian Processes
This work shows that for the most commonly used covariance functions, the matrix C can be hierarchically factored into a product of block low-rank updates of the identity matrix, yielding an O(n log^2 n) algorithm for inversion and enabling the evaluation of the determinant det(C), which permits the direct calculation of probabilities in high dimensions under fairly broad assumptions on the kernel defining K.
Data-sparse Approximation by Adaptive ℋ²-Matrices
The basic ideas of ℋ- and ℋ²-matrices are introduced and an algorithm that adaptively computes approximations of general matrices in the latter format is presented.
Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions
This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation, and presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions.
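A minimal sketch of one such randomized scheme, a Gaussian range finder followed by an SVD of the projected matrix, is given below using NumPy. The function name `randomized_svd`, the oversampling default, and the synthetic test matrix are assumptions made for illustration. Refinements such as power iterations are left out.

```python
import numpy as np


def randomized_svd(A, rank, oversample=10, seed=0):
    """Approximate rank-`rank` SVD of A via a random Gaussian sketch."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    k = min(rank + oversample, n)
    # Stage A: sample the range of A and orthonormalize the samples.
    omega = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(A @ omega)
    # Stage B: factor the small projected matrix and lift back.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank]


# Synthetic matrix of exact rank 5 (assumed test data).
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 100))
U, s, Vt = randomized_svd(A, rank=5)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
```

The two stages mirror the modular structure described in the summary: first find an approximate basis for the range, then compute a standard factorization of the much smaller projected matrix.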
Approximation of eigenfunctions in kernel-based spaces
This paper examines the corresponding eigenspaces of continuous kernels on compact domains, proves that they have optimality properties among all other subspaces of $\mathcal{H}$, and gives error bounds for the numerical approximation of the eigensystem via such subspaces.
LU Factorization of Non-standard Forms and Direct Multiresolution Solvers
This paper introduces the multiresolution LU factorization of non-standard forms (NS-forms), develops fast direct multiresolution methods for solving systems of linear algebraic equations arising in elliptic problems, and demonstrates how to construct the inverse in O(N) operations.
Localization of Matrix Factorizations
A comprehensive framework is derived to rigorously answer the question of when and under what conditions the matrix factors inherit the localization of the original matrix for such fundamental matrix factorizations as the LU, QR, Cholesky, and polar factorizations.
Efficient SVM Training Using Low-Rank Kernel Representations
This work shows that for a low-rank kernel matrix it is possible to design a better interior point method (IPM) in terms of both storage requirements and computational complexity, and derives an upper bound on the change in the objective function value based on the approximation error and the number of active constraints (support vectors).
A Sparse Matrix Arithmetic Based on H-Matrices. Part I: Introduction to H-Matrices
This paper is the first of a series and is devoted to the first introduction of the H-matrix concept, which allows the exact inversion of tridiagonal matrices.
Localization in Matrix Computations: Theory and Applications
Many important problems in mathematics and physics lead to (non-sparse) functions, vectors, or matrices in which the fraction of nonnegligible entries is vanishingly small compared to the total number