# Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity

@article{Schfer2021CompressionIA, title={Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity}, author={Florian Sch{\"a}fer and Timothy John Sullivan and Houman Owhadi}, journal={Multiscale Model. Simul.}, year={2021}, volume={19}, pages={688-730} }

Dense kernel matrices Θ ∈ R^(N×N) obtained from point evaluations of a covariance function G at locations {x_i}_{1≤i≤N} arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and approximately equally spaced sampling points, we show how to identify a subset S ⊂ {1,…,N}×{1,…,N}, with #S = O(N log(N) log^d(N/ϵ)), such that the zero fill-in block-incomplete Cholesky decomposition of Θ_(i,j) 1_((i,j)∈S) is an…
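The object at the heart of the abstract, a zero fill-in incomplete Cholesky factorization restricted to a sparsity set S, can be sketched in a few lines. This is a dense O(N³) toy for illustration only (the function name `ichol_pattern` and the boolean-mask representation of S are assumptions of this sketch); the paper's near-linear algorithm relies on sparse storage and a multiresolution ordering.

```python
import numpy as np

def ichol_pattern(theta, S):
    """Zero fill-in incomplete Cholesky: entries of theta outside the
    sparsity mask S are treated as zero and no fill-in outside S is
    ever created. S is a boolean (N, N) mask (lower triangle used)."""
    N = theta.shape[0]
    L = np.tril(np.where(S, theta, 0.0))
    for k in range(N):
        L[k, k] = np.sqrt(L[k, k])
        for i in range(k + 1, N):
            if S[i, k]:
                L[i, k] /= L[k, k]
        # Schur-complement update, restricted to positions inside S
        for j in range(k + 1, N):
            for i in range(j, N):
                if S[i, j]:
                    L[i, j] -= L[i, k] * L[j, k]
    return L
```

With the full mask S this reduces to the exact Cholesky factorization; the point of the paper is that for the right ordering and a pattern of size O(N log(N) log^d(N/ϵ)), the restricted factorization remains an ϵ-accurate approximation.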


## 45 Citations

Kernel Flows: from learning kernels from data into the abyss

- Computer Science, J. Comput. Phys.
- 2019

Sparse Cholesky factorization by Kullback-Leibler minimization

- Computer Science, SIAM J. Sci. Comput.
- 2021

An elimination ordering and sparsity pattern are proposed that allow the computation of ϵ-approximate inverse Cholesky factors of such Θ in computational complexity O(N log(N/ϵ)^d) in space and O(N log(N/ϵ)^(2d)) in time, which is the best asymptotic complexity for this class of problems.
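The elimination ordering named in this summary is typically a coarse-to-fine maximin ordering: after an arbitrary first point, each point selected is the one farthest from all previously selected points, and the sparsity pattern pairs each point with neighbors within a radius proportional to its selection distance. A minimal sketch, assuming the sampling points are given as an (N, d) array (the helper name `maximin_ordering` is illustrative):

```python
import numpy as np

def maximin_ordering(points, first=0):
    """Greedy maximin ordering of an (N, d) point array. Returns the
    ordering and the selection distances (length scales); the first
    point's scale is infinite by convention."""
    N = len(points)
    dist = np.full(N, np.inf)        # distance of each point to the selected set
    order = np.empty(N, dtype=int)
    scale = np.empty(N)
    for k in range(N):
        i = first if k == 0 else int(np.argmax(dist))
        order[k], scale[k] = i, dist[i]
        # shrink every point's distance-to-selected-set given the new point i
        dist = np.minimum(dist, np.linalg.norm(points - points[i], axis=1))
    return order, scale
```

The quadratic loop above is for clarity; near-linear implementations maintain the distances with spatial data structures.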

Posterior and Computational Uncertainty in Gaussian Processes

- Computer Science, ArXiv
- 2022

A new class of methods is developed that provides consistent estimation of the combined uncertainty arising from both the finite number of data observed and the finite amount of computation expended, and the consequences of ignoring computational uncertainty are demonstrated.

Samplets: A new paradigm for data compression

- Computer Science, ArXiv
- 2021

The concept of samplets is introduced by transferring the construction of Tausch-White wavelets to the realm of data to obtain a multilevel representation of discrete data which directly enables data compression, detection of singularities and adaptivity.

Learning Partial Differential Equations in Reproducing Kernel Hilbert Spaces

- Mathematics, Computer Science, ArXiv
- 2021

A new data-driven approach is presented for learning the fundamental solutions of various linear partial differential equations (PDEs) given sample pairs of input-output functions, by estimating the best-fit Green's function and bias term in a reproducing kernel Hilbert space (RKHS), which allows one to regularize their smoothness and impose various structural constraints.

Certified and fast computations with shallow covariance kernels

- Computer Science, Foundations of Data Science
- 2020

This work introduces and analyzes a new and certified algorithm for the low-rank approximation of a parameterized family of covariance operators, which extends the adaptive cross approximation method for symmetric positive definite matrices.
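At the matrix level, adaptive cross approximation for symmetric positive definite matrices amounts to diagonally pivoted partial Cholesky: greedily eliminate the largest remaining diagonal entry, building a low-rank factor one column at a time. A minimal dense sketch (the function name and stopping tolerance are illustrative assumptions, not the certified algorithm of the cited work):

```python
import numpy as np

def pivoted_cholesky(A, max_rank, tol=1e-12):
    """Diagonally pivoted partial Cholesky of an SPD matrix A, yielding a
    low-rank approximation A ~= L @ L.T. The vector d tracks the diagonal
    of the remaining Schur complement, which bounds the residual."""
    N = A.shape[0]
    d = np.array(np.diag(A), dtype=float)
    L = np.zeros((N, max_rank))
    for k in range(max_rank):
        p = int(np.argmax(d))            # pivot: largest remaining diagonal
        if d[p] <= tol:                  # residual negligible, stop early
            return L[:, :k]
        L[:, k] = (A[:, p] - L[:, :k] @ L[p, :k]) / np.sqrt(d[p])
        d -= L[:, k] ** 2                # update Schur-complement diagonal
    return L
```

Only `max_rank` columns of A are ever touched, so the cost is O(N · max_rank²) plus `max_rank` column evaluations of the kernel.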

Graph-Induced Rank Structures and their Representations

- Mathematics, Computer Science, ArXiv
- 2019

The notion of a graph-induced rank structure (GIRS) is introduced, which aims to capture the fine low-rank structures that appear in sparse matrices and their inverses in terms of the adjacency graph $\mathbb{G}$.

De-noising by thresholding operator adapted wavelets

- Mathematics
- 2018

Donoho and Johnstone [13] proposed a method for reconstructing an unknown smooth function u from noisy data u + ζ by translating the empirical wavelet coefficients of u + ζ towards zero. We consider…
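The Donoho–Johnstone scheme alluded to here can be illustrated with a one-level Haar transform and the soft-thresholding rule. This is an illustrative toy (operator-adapted wavelets replace the Haar basis with one adapted to the operator, and the input length is assumed even):

```python
import numpy as np

def haar_denoise(y, threshold):
    """One-level Haar soft-thresholding: keep the local averages, shrink the
    local differences (detail coefficients) towards zero, invert."""
    a = (y[0::2] + y[1::2]) / np.sqrt(2)   # approximation coefficients
    d = (y[0::2] - y[1::2]) / np.sqrt(2)   # detail coefficients
    d = np.sign(d) * np.maximum(np.abs(d) - threshold, 0.0)  # soft threshold
    out = np.empty_like(y, dtype=float)    # inverse Haar transform
    out[0::2] = (a + d) / np.sqrt(2)
    out[1::2] = (a - d) / np.sqrt(2)
    return out
```

With threshold 0 the transform pair is the identity; with a positive threshold, small-magnitude details (mostly noise) are suppressed while smooth trends survive.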

Fast Eigenpairs Computation with Operator Adapted Wavelets and Hierarchical Subspace Correction

- Computer Science, SIAM J. Numer. Anal.
- 2019

The proposed algorithm is robust in the presence of multiple (a continuum of) scales and is shown to be of near-linear complexity when $\mathcal{L}$ is an (arbitrary local, e.g.~differential) operator mapping $\mathcal{H}^s_0(\Omega)$ to $\mathcal{H}^{-s}(\Omega)$ (e.g.~an elliptic PDE with rough coefficients).

Kernel Mode Decomposition and programmable/interpretable regression networks

- Computer Science, ArXiv
- 2019

A framework for programmable and interpretable regression networks for pattern recognition is introduced, with mode decomposition addressed as a prototypical problem; the structure of some of these networks shares intriguing similarities with convolutional neural networks while being interpretable, programmable, and amenable to theoretical analysis.

## References

Showing 1–10 of 174 references

Compressing Rank-Structured Matrices via Randomized Sampling

- Computer Science, SIAM J. Sci. Comput.
- 2016

The proposed scheme is also useful in simplifying the implementation of certain operations on rank-structured matrices such as the matrix-matrix multiplication, low-rank update, addition, etc.

Data-sparse Approximation by Adaptive ℋ2-Matrices

- Computer Science, Mathematics, Computing
- 2002

The basic ideas of ℋ- and ℋ2-matrices are introduced and an algorithm that adaptively computes approximations of general matrices in the latter format is presented.

Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

- Computer Science, SIAM Rev.
- 2011

This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation, and presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions.
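The basic building block surveyed in that work, the randomized range finder, fits in a few lines of NumPy (parameter names are illustrative): sample the range of A with a Gaussian test matrix, orthonormalize the sample, and project.

```python
import numpy as np

def randomized_low_rank(A, rank, oversample=5, seed=None):
    """Randomized range finder: A @ omega samples the range of A; QR gives
    an orthonormal basis Q for that sample, and A ~= Q @ (Q.T @ A)."""
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((A.shape[1], rank + oversample))
    Q, _ = np.linalg.qr(A @ omega)   # orthonormal basis for the sampled range
    B = Q.T @ A                      # small (rank + oversample) x n factor
    return Q, B
```

If a truncated SVD is wanted, one takes the SVD of the small factor B and multiplies its left factor by Q; the oversampling parameter controls the failure probability.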

Approximation of eigenfunctions in kernel-based spaces

- Mathematics, Computer Science, Adv. Comput. Math.
- 2016

This paper examines the corresponding eigenspaces of continuous kernels on compact domains, proves that they have optimality properties among all other subspaces of $\mathcal{H}$, and gives error bounds for the numerical approximation of the eigensystem via such subspaces.
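A standard way to realize such subspace approximations of a kernel's eigensystem numerically is the Nyström extension, sketched below. Function and variable names are illustrative, and the eigenvalue scaling assumes uniformly weighted landmark points:

```python
import numpy as np

def nystrom_eigenfunctions(kernel, X_land, X_query, k):
    """Nystrom extension: diagonalize the kernel on m landmark points, then
    extend the top-k discrete eigenvectors to eigenfunction values at
    X_query via phi_j(x) ~= sqrt(m)/lambda_j * sum_i K(x, x_i) v_{ij}."""
    m = len(X_land)
    K = kernel(X_land, X_land)                          # m x m Gram matrix
    vals, vecs = np.linalg.eigh(K)                      # ascending order
    vals, vecs = vals[::-1][:k], vecs[:, ::-1][:, :k]   # top-k, descending
    phi = kernel(X_query, X_land) @ vecs * (np.sqrt(m) / vals)
    return phi, vals / m    # eigenfunction values, operator eigenvalue estimates
```

Evaluated back at the landmarks, the approximate eigenfunctions are orthonormal in the empirical inner product, mirroring the L² orthonormality of the continuous eigenfunctions.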

LU Factorization of Non-standard Forms and Direct Multiresolution Solvers

- Computer Science, Mathematics
- 1998

This paper introduces the multiresolution LU factorization of non-standard forms (NS-forms), develops fast direct multiresolution methods for solving systems of linear algebraic equations arising in elliptic problems, and demonstrates how to construct the inverse in O(N) operations.

Localization of Matrix Factorizations

- Computer Science, Found. Comput. Math.
- 2015

A comprehensive framework is derived to rigorously answer the question of when and under what conditions the matrix factors inherit the localization of the original matrix for such fundamental matrix factorizations as the LU, QR, Cholesky, and polar factorizations.

Efficient SVM Training Using Low-Rank Kernel Representations

- Computer Science, J. Mach. Learn. Res.
- 2001

This work shows that for a low-rank kernel matrix it is possible to design a better interior point method (IPM) in terms of storage requirements as well as computational complexity, and derives an upper bound on the change in the objective function value based on the approximation error and the number of active constraints (support vectors).

A Sparse Matrix Arithmetic Based on H-Matrices. Part I: Introduction to H-Matrices

- Computer Science, Computing
- 1999

This paper is the first of a series and is devoted to the first introduction of the H-matrix concept, which allows the exact inversion of tridiagonal matrices.

Localization in Matrix Computations: Theory and Applications

- Mathematics
- 2016

Many important problems in mathematics and physics lead to (non-sparse) functions, vectors, or matrices in which the fraction of nonnegligible entries is vanishingly small compared to the total number…

Randomized matrix-free trace and log-determinant estimators

- Computer Science, Mathematics, Numerische Mathematik
- 2017

Randomized algorithms for estimating the trace and log-determinant of Hermitian positive semi-definite matrices are presented, together with an analysis of the error due to randomization, for starting guesses whose elements are Gaussian or Rademacher random variables.
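The matrix-free trace estimator with Rademacher probes is simple to state. This is a generic Hutchinson sketch under the assumption that only matrix-vector products with A are available, not the refined estimators analyzed in the cited paper:

```python
import numpy as np

def hutchinson_trace(matvec, n, num_samples=100, seed=None):
    """Hutchinson trace estimator: for Rademacher vectors z with independent
    +/-1 entries, E[z.T @ A @ z] = tr(A), so averaging quadratic forms over
    random probes estimates the trace using only matvecs with A."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_samples):
        z = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe vector
        total += z @ matvec(z)
    return total / num_samples
```

The log-determinant estimators in the paper combine this idea with tr(log A), approximating log A applied to each probe by a Krylov or polynomial method.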