# Estimating the Support of a High-Dimensional Distribution

@article{Schlkopf2001EstimatingTS, title={Estimating the Support of a High-Dimensional Distribution}, author={Bernhard Sch{\"o}lkopf and John C. Platt and John Shawe-Taylor and Alex Smola and Robert C. Williamson}, journal={Neural Computation}, year={2001}, volume={13}, pages={1443-1471} }

Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. The proposed method approaches this problem by estimating a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space.
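As a minimal sketch of this setup: scikit-learn's `OneClassSVM` implements a ν-parameterized one-class algorithm of this kind, where `nu` plays the role of the a priori specified value. The data and kernel parameters below are illustrative only.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 2))   # sample from the underlying distribution P

# nu is the a priori specified value: an upper bound on the fraction of
# training points that end up outside the estimated region S
clf = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1).fit(X_train)

# decision_function evaluates the learned kernel expansion f; f(x) >= 0 inside S
scores = clf.decision_function(X_train)
print(f"fraction of training points outside S: {(scores < 0).mean():.2f}")
```

On this sample the observed outlier fraction comes out close to the chosen `nu`, which is exactly the control the abstract describes.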

## 4,989 Citations

### Support Vector Method for Novelty Detection

- Computer Science, NIPS
- 1999

The algorithm is a natural extension of the support vector algorithm to the case of unlabelled data and is regularized by controlling the length of the weight vector in an associated feature space.

### Learning from positive and unlabeled examples by enforcing statistical significance

- Computer Science, AISTATS
- 2011

This work formalizes the problem of characterizing the positive class as one of learning a feature-based score function that minimizes the p-value of a nonparametric statistical hypothesis test, and provides a solution computed by a one-class SVM applied to a surrogate dataset.

### Covariate Shift by Kernel Mean Matching

- Computer Science
- 2008

This paper solves the problem of re-weighting the training data such that its distribution more closely matches that of the test data by matching covariate distributions between training and test sets in a high dimensional feature space (specifically, a reproducing kernel Hilbert space).
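The core idea can be sketched with a simplified, unconstrained variant of kernel mean matching solved in closed form (the paper's formulation is a constrained quadratic program; the data and kernel width below are illustrative):

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    # RBF kernel matrix between row-sets A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X_tr = rng.normal(0.0, 1.0, size=(300, 1))   # training inputs
X_te = rng.normal(1.0, 1.0, size=(300, 1))   # covariate-shifted test inputs

# Match the weighted training mean embedding to the test mean embedding
# in the RKHS: solve K beta = kappa (small ridge added for stability)
K = rbf(X_tr, X_tr)
kappa = (len(X_tr) / len(X_te)) * rbf(X_tr, X_te).sum(axis=1)
beta = np.linalg.solve(K + 1e-3 * np.eye(len(X_tr)), kappa)

# The beta-weighted training sample should now resemble the test sample
print(f"weighted training mean: {beta @ X_tr[:, 0] / beta.sum():.2f}")
```

After reweighting, the weighted training mean moves from roughly 0 toward the test mean of 1, which is the covariate-shift correction the paper formalizes.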

### Support Measure Data Description

- Computer Science
- 2014

This work addresses the problem of learning a data description model for a dataset whose elements or observations are themselves sets of points in R^D, by computing a minimum volume set for the probability measures via a minimum enclosing ball of the representer functions in a reproducing kernel Hilbert space (RKHS).

### Spectral Regularization for Support Estimation

- Mathematics, Computer Science, NIPS
- 2010

A new class of regularized spectral estimators is proposed, based on a new notion of reproducing kernel Hilbert space called "completely regular", which allows one to capture the relevant geometric and topological properties of an arbitrary probability space.

### Support Distribution Machines

- Computer Science, ArXiv
- 2012

The projection of the estimated Gram matrix onto the cone of positive semidefinite matrices enables us to employ the kernel trick, and hence use kernel machines for classification, regression, anomaly detection, and low-dimensional embedding in the space of distributions.
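The PSD projection step can be sketched as follows, using the generic eigenvalue-clipping (Frobenius-norm) projection; this is a standard construction and not necessarily the exact procedure used in the paper:

```python
import numpy as np

def project_psd(K):
    """Project a symmetric matrix onto the PSD cone by clipping
    negative eigenvalues to zero (Frobenius-norm projection)."""
    K = (K + K.T) / 2                 # symmetrize first
    w, V = np.linalg.eigh(K)
    return V @ np.diag(np.clip(w, 0, None)) @ V.T

K = np.array([[2.0, 3.0], [3.0, 2.0]])    # symmetric but indefinite (eigs 5, -1)
K_psd = project_psd(K)
print(np.linalg.eigvalsh(K_psd))          # all eigenvalues now >= 0 (up to rounding)
```

Once the estimated Gram matrix is valid PSD, it can be fed to any kernel machine as a precomputed kernel.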

### The One Class Support Vector Machine Solution Path

- Computer Science, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07
- 2007

A heuristic for enforcing nestedness of the sets along the path is introduced, and a method for kernel bandwidth selection based on minimum integrated volume, a kind of AUC criterion, is presented.

### A Kernel Two-Sample Test

- Mathematics, Computer Science, J. Mach. Learn. Res.
- 2012

This work proposes a framework for analyzing and comparing distributions, which is used to construct statistical tests to determine if two samples are drawn from different distributions, and presents two distribution-free tests based on large deviation bounds for the maximum mean discrepancy (MMD).
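The quantity at the heart of these tests is easy to compute; below is the standard unbiased estimator of squared MMD with an RBF kernel (a common form of the statistic; the kernel width and data here are illustrative, and the paper's tests additionally use thresholds derived from large deviation bounds):

```python
import numpy as np

def mmd2_unbiased(X, Y, gamma=1.0):
    """Unbiased estimate of squared MMD between samples X and Y, RBF kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    n, m = len(X), len(Y)
    # drop diagonal terms for unbiasedness
    term_x = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return term_x + term_y - 2 * Kxy.mean()

rng = np.random.default_rng(0)
same = mmd2_unbiased(rng.normal(size=(200, 1)), rng.normal(size=(200, 1)))
diff = mmd2_unbiased(rng.normal(size=(200, 1)), rng.normal(2.0, 1.0, size=(200, 1)))
print(same, diff)  # near zero for identical distributions, clearly larger otherwise
```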

### Learning Minimum Volume Sets with Support Vector Machines

- Computer Science, 2006 16th IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing
- 2006

The reduction of minimum volume (MV) set estimation to Neyman-Pearson (NP) classification is described, improved methods for generating artificial uniform data for the two-class approach are devised, and a new performance measure for systematic comparison of MV-set algorithms is advocated.

## References

Showing 1-10 of 70 references

### On nonparametric estimation of density level sets

- Mathematics
- 1997

Let X_1, ..., X_n be independent identically distributed observations from an unknown probability density f(·). Consider the problem of estimating the level set G = G_f(λ) = {x ∈ R^2 : f(x) ≥ λ} from…
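For intuition, a plug-in approach estimates f with a kernel density estimate and thresholds it at λ. This is a sketch only, not the estimator analyzed in this reference; the sample, grid, and level are illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
sample = rng.normal(size=(2, 1000))  # 2-D standard normal; gaussian_kde wants (dim, n)

kde = gaussian_kde(sample)           # plug-in estimate of the density f
lam = 0.05                           # the level lambda

# Estimated level set G = {x in R^2 : f_hat(x) >= lambda}, evaluated on a grid
xs = np.linspace(-3, 3, 61)
grid = np.array(np.meshgrid(xs, xs)).reshape(2, -1)
in_G = kde(grid) >= lam
print(f"{in_G.mean():.2f} of grid points fall in the estimated level set")
```

For a standard normal the estimated set is (approximately) a disc around the origin, since the true density exceeds 0.05 only within radius ≈ 1.5.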

### Generalization Performance of Classifiers in Terms of Observed Covering Numbers

- Mathematics, Computer Science, EuroCOLT
- 1999

It is shown that one can utilize an analogous argument in terms of the observed covering numbers on a single m-sample (being the actual observed data points) to bound the generalization performance of a classifier by using a margin-based analysis.

### Detection of Abnormal Behavior Via Nonparametric Estimation of the Support

- Mathematics
- 1980

In this paper two problems are considered, both involving the nonparametric estimation of the support of a random vector from a sequence of independent identically distributed observations. In the…

### Structural Risk Minimization Over Data-Dependent Hierarchies

- Computer Science, IEEE Trans. Inf. Theory
- 1998

A result is presented that allows one to trade off errors on the training sample against improved generalization performance, along with a more general result in terms of "luckiness" functions, which provides a quite general way of exploiting serendipitous simplicity in the observed data to obtain better prediction accuracy from small training sets.

### Margin Distribution Bounds on Generalization

- Computer Science, EuroCOLT
- 1999

It is shown that a slight generalization of their construction can be used to give a PAC-style bound on the tail of the distribution of generalization errors that arise from a given sample size.

### Entropy Numbers, Operators and Support Vector Kernels

- Mathematics, Computer Science, EuroCOLT
- 1999

New bounds for the generalization error of feature space machines, such as support vector machines and related regularization networks, are derived by obtaining new bounds on their covering numbers by virtue of the eigenvalues of an integral operator induced by the kernel function used by the machine.

### Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators

- Mathematics, IEEE Trans. Inf. Theory
- 2001

New bounds for the generalization error of kernel machines, such as support vector machines and related regularization networks, are derived by obtaining new bounds on their covering numbers by using the eigenvalues of an integral operator induced by the kernel function used by the machine.

### Kernel method for percentile feature extraction

- Mathematics
- 2000

A method is proposed which computes a direction in a dataset such that a specified fraction of a particular class of all examples is separated from the overall mean by a maximal margin; this method can be thought of as a robust form of principal component analysis, where instead of variance the authors maximize percentile thresholds.

### Support vector learning

- Computer Science
- 1997

This book provides a comprehensive analysis of what can be done using support vector machines, achieving record results in real-life pattern recognition problems, and proposes a new form of nonlinear principal component analysis using support vector kernel techniques, which is considered the most natural and elegant generalization of classical principal component analysis.
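Kernel PCA of this kind is available off the shelf; a brief illustration with scikit-learn's `KernelPCA` on a noisy circle, a dataset where linear PCA finds no useful one-dimensional structure (parameters here are illustrative):

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
# points on a unit circle with small Gaussian noise
X = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.normal(size=(200, 2))

# nonlinear principal components computed in an RBF kernel feature space
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=2.0)
Z = kpca.fit_transform(X)
print(Z.shape)
```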

### Extracting Support Data for a Given Task

- Computer Science, KDD
- 1995

It is observed that three different types of handwritten digit classifiers construct their decision surfaces from strongly overlapping small subsets of the database, which opens up the possibility of compressing databases significantly by disposing of the data that is not important for the solution of a given task.