Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity
@article{Schfer2021CompressionIA,
  title   = {Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity},
  author  = {Florian Sch{\"a}fer and Timothy John Sullivan and Houman Owhadi},
  journal = {Multiscale Model. Simul.},
  year    = {2021},
  volume  = {19},
  pages   = {688--730}
}
Dense kernel matrices Θ ∈ ℝ^{N×N} obtained from point evaluations of a covariance function G at locations {x_i}_{1≤i≤N} arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems, and for approximately equally spaced sampling points, we show how to identify a subset S ⊂ {1,…,N}×{1,…,N}, with #S = O(N log(N) log^d(N/ε)), such that the zero fill-in block-incomplete Cholesky decomposition of Θ_{ij} 1_{(i,j)∈S} is an…
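A minimal NumPy sketch of the central operation: sparsify the kernel matrix on a pattern S and take its zero fill-in incomplete Cholesky factorization. The exponential kernel and the distance-based pattern below are illustrative stand-ins; the paper derives S from a maximin ordering and a hierarchy of length scales, which is what guarantees accuracy and stability.

```python
import numpy as np

def ichol_zero_fill(A, S):
    # Zero fill-in incomplete Cholesky: only entries (i, j) allowed by the
    # (symmetric) boolean pattern S are computed; all others stay zero.
    N = A.shape[0]
    L = np.zeros_like(A)
    for j in range(N):
        d = A[j, j] - L[j, :j] @ L[j, :j]
        L[j, j] = np.sqrt(d)  # assumes d > 0; the paper's ordering ensures this
        for i in range(j + 1, N):
            if S[i, j]:
                L[i, j] = (A[i, j] - L[i, :j] @ L[j, :j]) / L[j, j]
    return L

rng = np.random.default_rng(0)
x = rng.random((200, 2))                        # sampling points in [0, 1]^2
r = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
Theta = np.exp(-r)                              # a Green's-function-like kernel
S = r < 0.3                                     # hypothetical distance cutoff
L = ichol_zero_fill(np.where(S, Theta, 0.0), S)
rel_err = np.linalg.norm(L @ L.T - Theta) / np.linalg.norm(Theta)
```

In a sparse implementation each column touches only the entries permitted by S, so the cost is governed by #S rather than N².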
44 Citations
Kernel Flows: from learning kernels from data into the abyss
- Computer Science · J. Comput. Phys.
- 2019
Sparse Cholesky factorization by Kullback-Leibler minimization
- Computer Science · SIAM J. Sci. Comput.
- 2021
An elimination ordering and sparsity pattern are proposed that allow the computation of ε-approximate inverse Cholesky factors of such Θ in O(N log^d(N/ε)) space and O(N log^{2d}(N/ε)) time, which is the best known asymptotic complexity for this class of problems.
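The closed form behind this, as a sketch: each column of the KL-optimal sparse inverse Cholesky factor comes from a small dense solve and can be computed independently of the others. Here the sparsity sets are passed in as plain index lists rather than built from the paper's maximin ordering.

```python
import numpy as np

def kl_inverse_cholesky(Theta, pattern):
    # Column i of the KL-optimal factor L (with L @ L.T ~= inv(Theta)),
    # restricted to the index set s = pattern[i] (s[0] == i by convention):
    #   L[s, i] = inv(Theta[s, s]) @ e1 / sqrt(e1' @ inv(Theta[s, s]) @ e1)
    N = Theta.shape[0]
    L = np.zeros((N, N))
    for i in range(N):
        s = pattern[i]
        K = Theta[np.ix_(s, s)]
        e1 = np.zeros(len(s)); e1[0] = 1.0
        v = np.linalg.solve(K, e1)
        L[s, i] = v / np.sqrt(v[0])
    return L
```

Because the columns are independent of one another, the loop parallelizes trivially.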
Preconditioning for Scalable Gaussian Process Hyperparameter Optimization
- Computer Science
- 2021
This work proves general probabilistic error bounds for the preconditioned computation of the log-determinant, log-marginal likelihood and its derivatives, and derives specific rates for a range of kernel-preconditioner combinations, showing that up to exponential convergence can be achieved.
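As a concrete illustration of the setting (not the paper's bounds), here is a low-rank-plus-Woodbury preconditioner for CG solves with K + σ²I, built from a hypothetical Nyström factor on m landmark columns:

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular
from scipy.sparse.linalg import LinearOperator, cg

def nystrom_factor(K, m, jitter=1e-10):
    # K ~= F @ F.T from m randomly chosen landmark columns
    J = np.random.default_rng(0).choice(K.shape[0], m, replace=False)
    C, W = K[:, J], K[np.ix_(J, J)]
    R = cholesky(W + jitter * np.eye(m), lower=True)
    return solve_triangular(R, C.T, lower=True).T

def woodbury_preconditioner(F, sigma2):
    # Applies (F F^T + sigma^2 I)^{-1} via the Woodbury identity
    m = F.shape[1]
    M_small = sigma2 * np.eye(m) + F.T @ F
    apply_ = lambda v: (v - F @ np.linalg.solve(M_small, F.T @ v)) / sigma2
    return LinearOperator((F.shape[0],) * 2, matvec=apply_)

# usage sketch, with K, sigma2, y assumed given:
#   F = nystrom_factor(K, m=200)
#   x, info = cg(K + sigma2 * np.eye(K.shape[0]), y,
#                M=woodbury_preconditioner(F, sigma2))
```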
Posterior and Computational Uncertainty in Gaussian Processes
- Computer Science · ArXiv
- 2022
A new class of methods is developed that provides consistent estimation of the combined uncertainty arising from both the finite number of data observed and the finite amount of computation expended, and the consequences of ignoring computational uncertainty are demonstrated.
Samplets: A new paradigm for data compression
- Computer Science · ArXiv
- 2021
The concept of samplets is introduced by transferring the construction of Tausch-White wavelets to the realm of data to obtain a multilevel representation of discrete data which directly enables data compression, detection of singularities and adaptivity.
Learning Partial Differential Equations in Reproducing Kernel Hilbert Spaces
- Mathematics, Computer Science · ArXiv
- 2021
A new data-driven approach for learning the fundamental solutions of various linear partial differential equations (PDEs) given sample pairs of input-output functions, by estimating the best-fit Green's function and bias term in a reproducing kernel Hilbert space (RKHS), which allows one to regularize their smoothness and impose various structural constraints.
Certified and fast computations with shallow covariance kernels
- Computer Science · ArXiv
- 2020
This work introduces and analyzes a new and certified algorithm for the low-rank approximation of a parameterized family of covariance operators, which extends the adaptive cross approximation method for symmetric positive definite matrices.
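For SPD matrices, cross approximation with diagonal pivoting reduces to pivoted Cholesky; a minimal sketch of that baseline follows (the paper's contribution, the certified parametric extension, is not reproduced here):

```python
import numpy as np

def aca_spd(A, tol=1e-8, max_rank=100):
    # Cross approximation of an SPD matrix with diagonal pivoting:
    # A ~= U @ U.T, stopping once the largest remaining pivot is small.
    R = A.copy()
    cols = []
    for _ in range(max_rank):
        i = int(np.argmax(np.diag(R)))
        piv = R[i, i]
        if piv <= tol:
            break
        u = R[:, i] / np.sqrt(piv)
        cols.append(u)
        R -= np.outer(u, u)
    if not cols:
        return np.zeros((A.shape[0], 0))
    return np.column_stack(cols)
```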
Graph-Induced Rank Structures and their Representations
- Mathematics, Computer Science · ArXiv
- 2019
The notion of a graph-induced rank structure (GIRS) is introduced, aiming to capture the fine low-rank structures that appear in sparse matrices and their inverses in terms of the adjacency graph $\mathbb{G}$.
De-noising by thresholding operator adapted wavelets
- Mathematics
- 2018
Donoho and Johnstone [13] proposed a method for reconstructing an unknown smooth function u from noisy data u + ζ by translating the empirical wavelet coefficients of u + ζ towards zero. We consider…
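The classical version of that procedure, with off-the-shelf wavelets (PyWavelets) and the universal threshold, might look as follows; the cited work replaces these with operator-adapted wavelets (gamblets):

```python
import numpy as np
import pywt

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1024)
u = np.sin(4 * np.pi * t) * np.exp(-t)          # unknown smooth signal
noisy = u + 0.1 * rng.standard_normal(t.size)   # observed data u + zeta

coeffs = pywt.wavedec(noisy, 'db4', level=6)
sigma = np.median(np.abs(coeffs[-1])) / 0.6745  # MAD noise estimate
lam = sigma * np.sqrt(2 * np.log(noisy.size))   # universal threshold
denoised = [coeffs[0]] + [pywt.threshold(c, lam, mode='soft')
                          for c in coeffs[1:]]
u_hat = pywt.waverec(denoised, 'db4')           # reconstructed signal
```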
Fast Eigenpairs Computation with Operator Adapted Wavelets and Hierarchical Subspace Correction
- Computer Science · SIAM J. Numer. Anal.
- 2019
The proposed algorithm is robust in the presence of multiple (a continuum of) scales and is shown to be of near-linear complexity when $\mathcal{L}$ is an arbitrary local (e.g., differential) operator mapping $\mathcal{H}^s_0(\Omega)$ to $\mathcal{H}^{-s}(\Omega)$ (e.g., an elliptic PDE with rough coefficients).
References
Showing 1-10 of 176 references
Compressing Rank-Structured Matrices via Randomized Sampling
- Computer Science · SIAM J. Sci. Comput.
- 2016
The proposed scheme is also useful in simplifying the implementation of certain operations on rank-structured matrices such as the matrix-matrix multiplication, low-rank update, addition, etc.
Fast Direct Methods for Gaussian Processes
- Computer Science · IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2016
This work shows that, for the most commonly used covariance functions, the matrix C can be hierarchically factored into a product of block low-rank updates of the identity matrix, yielding an O(n log² n) algorithm for inversion; the factorization also enables evaluation of the determinant det(C), permitting the direct calculation of probabilities in high dimensions under fairly broad assumptions on the covariance kernel.
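A structural sketch of the hierarchical (HODLR-type) idea: diagonal blocks recurse, off-diagonal blocks are treated as low-rank. Compression is done with a dense SVD purely for clarity; a fast implementation would build the low-rank factors directly (e.g., by interpolation or randomized sampling) and would factor, not merely apply, the matrix.

```python
import numpy as np

def hodlr_matvec(A, x, leaf=64, rank=16):
    # Apply A to x using a hierarchical off-diagonal low-rank splitting.
    n = A.shape[0]
    if n <= leaf:
        return A @ x
    m = n // 2
    def lowrank(B, v):                      # rank-truncated block apply
        U, s, Vt = np.linalg.svd(B, full_matrices=False)
        return U[:, :rank] @ (s[:rank] * (Vt[:rank] @ v))
    top = hodlr_matvec(A[:m, :m], x[:m], leaf, rank) + lowrank(A[:m, m:], x[m:])
    bot = lowrank(A[m:, :m], x[:m]) + hodlr_matvec(A[m:, m:], x[m:], leaf, rank)
    return np.concatenate([top, bot])
```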
Data-sparse Approximation by Adaptive ℋ²-Matrices
- Computer Science, Mathematics · Computing
- 2002
The basic ideas of ℋ- and ℋ²-matrices are introduced and an algorithm that adaptively computes approximations of general matrices in the latter format is presented.
Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions
- Computer Science · SIAM Rev.
- 2011
This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation, and presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions.
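The core construction from this line of work, in a few lines: sketch the range with a Gaussian test matrix, orthonormalize, and recover a truncated SVD from the small projected matrix.

```python
import numpy as np

def randomized_svd(A, k, p=10, q=2):
    # Halko-Martinsson-Tropp style randomized SVD: Gaussian sketch with
    # oversampling p, q power iterations to sharpen spectral decay
    # (re-orthonormalize between iterations for better stability),
    # then a small dense SVD of the projected matrix.
    m, n = A.shape
    Omega = np.random.default_rng(0).standard_normal((n, k + p))
    Y = A @ Omega
    for _ in range(q):
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                  # orthonormal range basis
    B = Q.T @ A                             # small (k+p) x n matrix
    Uh, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Uh)[:, :k], s[:k], Vt[:k]
```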
Approximation of eigenfunctions in kernel-based spaces
- Mathematics, Computer Science · Adv. Comput. Math.
- 2016
This paper examines the corresponding eigenspaces of continuous kernels on compact domains, proves that they have optimality properties among all other subspaces of ℋ, and gives error bounds for the numerical approximation of the eigensystem via such subspaces.
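A common computational counterpart of this analysis is the Nyström extension, which evaluates approximate eigenfunctions of the kernel's integral operator at new points from the eigenvectors of the kernel matrix. A sketch with a hypothetical one-dimensional RBF kernel (assumes the k-th eigenvalue stays well above machine precision):

```python
import numpy as np

def nystrom_eigenfunctions(kernel, X, X_new, k=10):
    # k leading eigenpairs of the kernel matrix on X, extended to X_new
    # via phi_j(x) ~ sum_i kernel(x, x_i) v_ij / mu_j.
    K = kernel(X, X)
    mu, V = np.linalg.eigh(K)
    mu, V = mu[::-1][:k], V[:, ::-1][:, :k]   # descending order, truncated
    Phi = kernel(X_new, X) @ V / mu           # Nystrom extension
    return mu, Phi

rbf = lambda A, B: np.exp(-(A[:, None] - B[None, :]) ** 2)
mu, Phi = nystrom_eigenfunctions(rbf, np.linspace(-1, 1, 100),
                                 np.linspace(-1, 1, 400))
```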
LU Factorization of Non-standard Forms and Direct Multiresolution Solvers
- Computer Science, Mathematics
- 1998
This paper introduces the multiresolution LU factorization of non-standard forms (NS-forms), develops fast direct multiresolution methods for solving systems of linear algebraic equations arising in elliptic problems, and demonstrates how to construct the inverse in O(N) operations.
Localization of Matrix Factorizations
- Computer Science · Found. Comput. Math.
- 2015
A comprehensive framework is derived to rigorously answer the question of when and under what conditions the matrix factors inherit the localization of the original matrix for such fundamental matrix factorizations as the LU, QR, Cholesky, and polar factorizations.
Efficient SVM Training Using Low-Rank Kernel Representations
- Computer Science · J. Mach. Learn. Res.
- 2001
This work shows that for a low rank kernel matrix it is possible to design a better interior point method (IPM) in terms of storage requirements as well as computational complexity and derives an upper bound on the change in the objective function value based on the approximation error and the number of active constraints (support vectors).
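The same low-rank principle is easy to demonstrate with off-the-shelf tools (this is not the paper's interior point method): a Nyström feature map of modest rank feeds a linear SVM, sidestepping the dense n × n kernel matrix entirely.

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
# rank-200 approximation of the RBF kernel feeding a linear solver
clf = make_pipeline(Nystroem(kernel='rbf', n_components=200, random_state=0),
                    LinearSVC())
clf.fit(X, y)
```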
A Sparse Matrix Arithmetic Based on H-Matrices. Part I: Introduction to H-Matrices
- Computer Science · Computing
- 1999
This paper is the first of a series and is devoted to the introduction of the H-matrix concept, which allows the exact inversion of tridiagonal matrices.
Localization in Matrix Computations: Theory and Applications
- Mathematics
- 2016
Many important problems in mathematics and physics lead to (non-sparse) functions, vectors, or matrices in which the fraction of nonnegligible entries is vanishingly small compared to the total number…