# Sub-quadratic Algorithms for Kernel Matrices via Kernel Density Estimation

@article{Bakshi2022SubquadraticAF,
title={Sub-quadratic Algorithms for Kernel Matrices via Kernel Density Estimation},
author={Ainesh Bakshi and Piotr Indyk and Praneeth Kacham and Sandeep Silwal and Samson Zhou},
journal={ArXiv},
year={2022},
volume={abs/2212.00642}
}
• Published 1 December 2022
• Computer Science
• ArXiv
Kernel matrices, as well as the weighted graphs they represent, are ubiquitous objects in machine learning, statistics and other related fields. The main drawback of kernel methods (learning and inference using kernel matrices) is efficiency: given n input points, most kernel-based algorithms need to materialize the full n × n kernel matrix before performing any subsequent computation, thus incurring Ω(n²) runtime. Breaking this quadratic barrier for various problems has therefore been…
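As a concrete illustration of the Ω(n²) bottleneck the abstract refers to, here is a minimal NumPy sketch (function and parameter names are ours, not the paper's) that materializes a full Gaussian kernel matrix:

```python
import numpy as np

def gaussian_kernel_matrix(X, bandwidth=1.0):
    """Materialize the full n x n Gaussian kernel matrix
    K[i, j] = exp(-||x_i - x_j||^2 / bandwidth^2).
    This dense construction is the Omega(n^2) step that most
    kernel-based algorithms pay before any further computation."""
    sq = np.sum(X ** 2, axis=1)
    # pairwise squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / bandwidth ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
K = gaussian_kernel_matrix(X)  # an n x n = 500 x 500 matrix: quadratic in n
```

Storing K alone already costs Θ(n²) memory; the paper's premise is that many downstream quantities can be approximated without ever forming K.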
## 1 Citation

• ArXiv, 2023 (Computer Science)
An improvement is given in the regime $\delta \rightarrow 1$, where the regret of the algorithm approaches $\tilde{O}_n(\sqrt{T})$, matching the $T$ dependence of the standard online setting without space restrictions.

## References

SHOWING 1-10 OF 81 REFERENCES

• ICML, 2021 (Computer Science, Mathematics)
It is shown that the sum of matrix entries can be estimated to (1 + ε) relative error in time sublinear in n and linear in d for many popular kernels, including the Gaussian, exponential, and rational quadratic kernels.
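To make the estimated quantity concrete, here is a toy uniform-sampling Monte Carlo estimator of the kernel-matrix entry sum (our own sketch; the cited result uses a more sophisticated, density-estimation-based sampling scheme to achieve its sublinear-time guarantee):

```python
import numpy as np

def kernel_sum_estimate(X, n_pairs=20000, bandwidth=1.0, seed=0):
    """Estimate sum_{i,j} exp(-||x_i - x_j||^2 / bandwidth^2) by sampling
    entry pairs (i, j) uniformly at random. Illustrative only: unlike the
    cited algorithm, plain uniform sampling carries no sublinear-time
    (1 + eps) relative-error guarantee in general."""
    rng = np.random.default_rng(seed)
    n = len(X)
    i = rng.integers(0, n, size=n_pairs)
    j = rng.integers(0, n, size=n_pairs)
    vals = np.exp(-np.sum((X[i] - X[j]) ** 2, axis=1) / bandwidth ** 2)
    return n * n * vals.mean()  # rescale the sample mean to the full n^2 sum
```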
• FOCS, 2018 (Computer Science)
This paper presents a collection of algorithms for efficient kernel density evaluation under the assumption that the kernel k is "smooth", i.e. its value changes at most polynomially with the distance, and gives a general reduction from density estimation to approximate near neighbor search in the underlying space.
• FOCS, 2020 (Computer Science)
This paper gives a single data structure, based on classical near neighbor search techniques, that improves upon or essentially matches the query time and space complexity for all radial kernels considered in the literature so far, and shows how to improve both the query complexity and runtime using recent advances in data-dependent near neighbor search.
• SDM, 2003 (Computer Science)
This paper presents an algorithm for kernel density estimation, the chief nonparametric approach, that is dramatically faster than previous algorithmic approaches in terms of both dataset size and dimensionality, and is an instance of a new principle of algorithm design: multi-recursion, or higher-order algorithm design.
• ArXiv, 2012 (Computer Science)
We design a new distribution over $m \times n$ matrices $S$ so that, for any fixed $n \times d$ matrix $A$ of rank $r$, with probability at least 9/10, $\|SAx\|_2 = (1 \pm \epsilon)\|Ax\|_2$ simultaneously for all $x \in \mathbb{R}^d$. Here, $m$ is…
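The norm-preservation guarantee above can be checked numerically with a dense Gaussian sketch, which satisfies the same property (the cited construction is sparser and faster to apply; this is only a sketch under that substitution, with names of our choosing):

```python
import numpy as np

def gaussian_sketch(A, m, seed=0):
    """Apply a random m x n matrix S with i.i.d. N(0, 1/m) entries to A.
    For m on the order of d / eps^2, ||S A x||_2 = (1 +/- eps) ||A x||_2
    holds simultaneously for all x with good probability (a subspace
    embedding); the cited paper designs sparser S with faster apply time."""
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)
    return S @ A

rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 5))   # tall n x d matrix
SA = gaussian_sketch(A, m=400)       # compressed to 400 x 5
x = rng.standard_normal(5)
ratio = np.linalg.norm(SA @ x) / np.linalg.norm(A @ x)  # close to 1
```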
• AISTATS, 2022 (Computer Science)
This work presents an algorithm called Density Estimation from Approximate Nearest Neighbors (DEANN), in which the idea is to find points that have a large contribution to the KDE using ANN, compute their contribution exactly, and approximate the remainder with random sampling (RS).
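A toy version of that idea can be sketched as follows, with brute-force nearest neighbors standing in for the ANN index (our simplification, not the DEANN implementation):

```python
import numpy as np

def deann_style_kde(X, q, k=32, m=64, bandwidth=1.0, seed=0):
    """Estimate the KDE (1/n) * sum_i exp(-||q - x_i||^2 / bandwidth^2)
    DEANN-style: sum the k nearest neighbors' contributions exactly
    (brute force here, standing in for an ANN query) and estimate the
    remaining points' total contribution by uniform random sampling."""
    rng = np.random.default_rng(seed)
    n = len(X)
    d2 = np.sum((X - q) ** 2, axis=1)
    near = np.argsort(d2)[:k]                 # stand-in for the ANN query
    exact = np.exp(-d2[near] / bandwidth ** 2).sum()
    far = np.setdiff1d(np.arange(n), near)
    sample = rng.choice(far, size=min(m, len(far)), replace=False)
    est_far = len(far) * np.exp(-d2[sample] / bandwidth ** 2).mean()
    return (exact + est_far) / n
```

The split works because the few near points carry most of the density mass and are handled exactly, while the many far points contribute little each, so a small random sample estimates their total well.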
• P. Ram, Dongryeol Lee
• NIPS, 2009 (Computer Science)
This paper shows, for the first time, O(N) worst-case runtimes for practical algorithms for pairwise distance computations, including all-nearest-neighbors, based on the cover tree data structure.
• Internet Math., 2010 (Computer Science, Mathematics)
This paper presents an efficient triangle-counting approximation algorithm that can be adapted to the semi-streaming model with bounded space usage and a constant number of passes over the graph stream, and applies these methods to various networks with several millions of edges, obtaining excellent results.
• Dongryeol Lee
• NIPS, 2005 (Computer Science)
The extent to which the dual-tree recursion with finite-difference approximation can be integrated with multipole-like Hermite expansions in order to achieve reasonable efficiency across all bandwidth scales is explored, though only for low dimensionalities.
• NIPS, 2008 (Computer Science)
This work provides an algorithm that combines tree methods with the Improved Fast Gauss Transform (IFGT), employing a tree data structure and yielding four evaluation methods whose performance varies with the distribution of sources and targets and with input parameters such as the desired accuracy and bandwidth.