• Corpus ID: 8782748

A Gentle Introduction to the Kernel Distance

@article{Phillips2011AGI,
  title={A Gentle Introduction to the Kernel Distance},
  author={J. M. Phillips and Suresh Venkatasubramanian},
  journal={ArXiv},
  year={2011},
  volume={abs/1103.1625}
}
This document reviews the definition of the kernel distance, providing a gentle introduction tailored to a reader with background in theoretical computer science, but limited exposure to technology more common to machine learning, functional analysis and geometric measure theory. The key aspect of the kernel distance developed here is its interpretation as an L2 distance between probability measures or various shapes (e.g. point sets, curves, surfaces) embedded in a vector space (specifically… 
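As the abstract notes, the kernel distance can be read as an L2 distance between embedded point sets. A minimal sketch of the explicit (quadratic-time) calculation, assuming a Gaussian kernel and the unnormalized pairwise-sum similarity kappa(P, Q) = sum over p in P, q in Q of K(p, q); the bandwidth sigma and this normalization convention are illustrative choices, not fixed by the paper:

```python
import numpy as np

def gaussian_kernel_sum(P, Q, sigma=1.0):
    # Sum of K(p, q) over all pairs, with K(p, q) = exp(-||p - q||^2 / (2 sigma^2)).
    # Explicit pairwise computation: O(|P| * |Q|) time and memory.
    diff = P[:, None, :] - Q[None, :, :]
    sq_dists = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dists / (2 * sigma ** 2)).sum()

def kernel_distance(P, Q, sigma=1.0):
    # D_K(P, Q)^2 = kappa(P, P) + kappa(Q, Q) - 2 * kappa(P, Q)
    sq = (gaussian_kernel_sum(P, P, sigma)
          + gaussian_kernel_sum(Q, Q, sigma)
          - 2 * gaussian_kernel_sum(P, Q, sigma))
    return np.sqrt(max(sq, 0.0))  # clamp tiny negatives from round-off

P = np.array([[0.0, 0.0], [1.0, 0.0]])
Q = np.array([[0.0, 1.0], [1.0, 1.0]])
print(kernel_distance(P, Q))  # positive for distinct sets
print(kernel_distance(P, P))  # zero for identical sets
```

The near-linear-time results surveyed below avoid exactly this quadratic pairwise sum.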
Comparing distributions and shapes using the kernel distance
TLDR
This paper presents fast approximation algorithms for computing the kernel distance between two point sets P and Q that run in near-linear time in the size of P ∪ Q (an explicit calculation would take quadratic time).
A Geometric Algorithm for Scalable Multiple Kernel Learning
TLDR
This work reinterprets the problem of learning kernel weights as searching for a kernel that maximizes the minimum (kernel) distance between two convex polytopes, reducing the multiple kernel learning (MKL) problem to a simple optimization routine with provable convergence and quality guarantees.
A New Distance for Data Sets in a Reproducing Kernel Hilbert Space Context
TLDR
Kernels for data sets that provide a metrization of the power set are introduced, and kernel distances are proposed that rely on the estimation of density level sets of the underlying data distributions and can be extended from data sets to probability measures.
Relative Error RKHS Embeddings for Gaussian Kernels (Nov 2018)
TLDR
The main insight is to modify the well-traveled random Fourier features so that they are slightly biased and have higher variance, but can be defined as a convolution over the function space.
The GaussianSketch for Almost Relative Error Kernel Distance
We introduce two versions of a new sketch for approximately embedding the Gaussian kernel into Euclidean inner product space. These work by truncating infinite expansions of the Gaussian kernel, and …
A new distance for data sets (and probability measures) in a RKHS context
TLDR
This paper proposes kernels for data sets that provide a metrization of the set of point sets (the power set), rely on the estimation of density level sets of the underlying distribution, and can be extended from data sets to probability measures.
Improved Coresets for Kernel Density Estimates
TLDR
This work provides a careful analysis of the iterative Frank-Wolfe algorithm adapted to this context (an algorithm called kernel herding), uniting a broad line of work that spans statistics, machine learning, and geometry.
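Kernel herding, mentioned in the summary above, greedily selects points whose empirical kernel mean tracks that of the full set. A sketch under illustrative assumptions (Gaussian kernel, candidates restricted to the input points themselves, bandwidth sigma = 1); the specific coreset constructions analyzed in the cited paper differ in their details:

```python
import numpy as np

def gaussian_k(x, Y, sigma=1.0):
    # Gaussian kernel values k(x, y) for each row y of Y.
    return np.exp(-np.sum((Y - x) ** 2, axis=1) / (2 * sigma ** 2))

def kernel_herding(P, m, sigma=1.0):
    """Greedily pick m points of P approximating the kernel mean of P.
    Step t+1 takes the candidate maximizing mu_P(x) - (1/(t+1)) * sum_i k(x, x_i),
    the standard herding / Frank-Wolfe objective (candidates drawn from P here)."""
    n = len(P)
    # mu_P evaluated at every candidate point: mean kernel value against all of P.
    mu = np.array([gaussian_k(p, P, sigma).mean() for p in P])
    chosen = []
    running = np.zeros(n)  # sum_i k(candidate, x_i) over selected x_i
    for t in range(m):
        scores = mu - running / (t + 1)
        j = int(np.argmax(scores))
        chosen.append(j)
        running += gaussian_k(P[j], P, sigma)
    return P[chosen]

rng = np.random.default_rng(0)
P = rng.normal(size=(200, 2))
coreset = kernel_herding(P, 10)
print(coreset.shape)  # (10, 2)
```

Herding converges faster than uniform sampling for approximating kernel density estimates, which is the starting point of the improved analyses the paper unites.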
Statistical distances and probability metrics for multivariate data, ensembles and probability distributions
TLDR
This thesis presents a distance that generalizes the Mahalanobis distance to the case where the distribution of the data is not Gaussian, and uses it to solve hypothesis tests and classification problems in general contexts, obtaining better results than other standard statistical methods.
Geometric Sampling of Networks
TLDR
This work appeals to three types of discrete curvature, namely the graph Forman-, full Forman-, and Haantjes-Ricci curvatures, for edge-based and node-based sampling, and considers fitting Ricci flows and employing them for the detection of networks' backbones.

References

Showing 1-10 of 19 references
Comparing distributions and shapes using the kernel distance
TLDR
This paper presents fast approximation algorithms for computing the kernel distance between two point sets P and Q that run in near-linear time in the size of P ∪ Q (an explicit calculation would take quadratic time).
Hilbertian Metrics and Positive Definite Kernels on Probability Measures
TLDR
The two-parameter family of Hilbertian metrics of Topsøe is extended so that it now includes all commonly used Hilbertian metrics on probability measures, which allows model selection among these metrics in an elegant and unified way.
From Zero to Reproducing Kernel Hilbert Spaces in Twelve Pages or Less
TLDR
This tutorial attempts to take the reader from a very basic understanding of fields through Banach spaces and Hilbert spaces, into Reproducing Kernel Hilbert Spaces.
Random Features for Large-Scale Kernel Machines
TLDR
Two sets of random features are explored, convergence bounds are provided on their ability to approximate various radial basis kernels, and it is shown that in large-scale classification and regression tasks, linear machine learning algorithms applied to these features outperform state-of-the-art large-scale kernel machines.
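The random Fourier feature construction summarized above maps inputs to a finite-dimensional space whose inner products approximate the kernel. A minimal sketch for the Gaussian (RBF) kernel, following the Rahimi-Recht cosine feature map; the bandwidth sigma and feature count D are illustrative defaults, not values from the paper:

```python
import numpy as np

def rff_features(X, D=500, sigma=1.0, rng=None):
    """Random Fourier feature map z with z(x) . z(y) ~= exp(-||x - y||^2 / (2 sigma^2)).
    Frequencies are sampled from the kernel's Fourier transform (a Gaussian)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / sigma, size=(d, D))  # spectral samples
    b = rng.uniform(0, 2 * np.pi, size=D)           # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

rng = np.random.default_rng(1)
X = rng.normal(size=(2, 3))
Z = rff_features(X, D=20000, sigma=1.0, rng=rng)
approx = Z[0] @ Z[1]
exact = np.exp(-np.sum((X[0] - X[1]) ** 2) / 2.0)
print(abs(approx - exact))  # small; shrinks as D grows
```

After this map, any linear method on the features behaves approximately like its kernelized counterpart, which is what makes the large-scale results possible.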
Reproducing kernel Hilbert spaces
  • L. Rosasco
  • Computer Science
    High-Dimensional Statistics
  • 2019
TLDR
The concept of “kernels” will provide us with a flexible, computationally feasible method for implementing Regularization, which requires a (possibly large) class of models and a method for evaluating the complexity of each model in the class.
Matching Shapes Using the Current Distance
TLDR
An interesting aspect of this work is that it can compute the current distance between curves, surfaces, and higher-order manifolds via a simple reduction to instances of weighted point sets, thus obviating the need for different kinds of algorithms for different kinds of shapes.
Hilbert Space Embeddings and Metrics on Probability Measures
TLDR
It is shown that the distance between distributions under γk results from an interplay between the properties of the kernel and the distributions, by demonstrating that distributions are close in the embedding space when their differences occur at higher frequencies.
Improved fast gauss transform and efficient kernel density estimation
TLDR
An improved fast Gauss transform is developed to efficiently estimate sums of Gaussians in higher dimensions, where a new multivariate expansion scheme and an adaptive space subdivision technique dramatically improve the performance.
Template estimation from unlabeled point set data and surfaces for Computational Anatomy
A central notion in Computational Anatomy is the generation of registration maps, mapping a large set of anatomical data to a common coordinate system to study intra-population variability and …
Distances euclidiennes sur les mesures signées et application à des théorèmes de Berry-Esséen [Euclidean distances on signed measures and application to Berry-Esseen theorems]
Starting from an integral representation of reproducing kernels, we define an inner product on bounded signed measures. The space of measures is embedded in a reproducing kernel Hilbert space and in a …