# Estimating the Support of a High-Dimensional Distribution

@article{Schlkopf2001EstimatingTS, title={Estimating the Support of a High-Dimensional Distribution}, author={Bernhard Sch{\"o}lkopf and John C. Platt and John Shawe-Taylor and Alex Smola and Robert C. Williamson}, journal={Neural Computation}, year={2001}, volume={13}, pages={1443-1471} }

Suppose you are given a data set drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. [...] Key method: the functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space.
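This estimator is implemented in scikit-learn as `OneClassSVM`, which follows the ν-parameterized one-class SVM of this paper; a minimal sketch on synthetic data (the kernel and parameter values below are illustrative choices, not taken from the paper):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))  # training sample drawn from P

# nu is an upper bound on the fraction of training points left outside the
# estimated region S, and a lower bound on the fraction of support vectors.
clf = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1).fit(X)

pred = clf.predict(X)                 # +1 inside S, -1 outside
outlier_frac = float(np.mean(pred == -1))
n_sv = clf.support_vectors_.shape[0]  # kernel expansion uses only these points
```

The decision function `clf.decision_function(x)` is exactly the kernel expansion over the support vectors, which is why only a potentially small subset of the training data is retained.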

## 4,568 Citations

Support Vector Method for Novelty Detection

- Computer Science, NIPS
- 1999

The algorithm is a natural extension of the support vector algorithm to the case of unlabelled data and is regularized by controlling the length of the weight vector in an associated feature space.

Learning from positive and unlabeled examples by enforcing statistical significance

- Computer Science, Mathematics, AISTATS
- 2011

This work formalizes the problem of characterizing the positive class as one of learning a feature-based score function that minimizes the p-value of a nonparametric statistical hypothesis test, and provides a solution computed by a one-class SVM applied to a surrogate dataset.

Covariate Shift by Kernel Mean Matching

- 2008

Given sets of observations of training and test data, we consider the problem of re-weighting the training data such that its distribution more closely matches that of the test data. We achieve this…

Exact rates in density support estimation

- Mathematics
- 2008

Let f be an unknown multivariate probability density with compact support S_f. Given n independent observations X_1, …, X_n drawn from f, this paper is devoted to the study of the estimator…

Support Measure Data Description

- Computer Science
- 2014

This work addresses the problem of learning a data description model for a dataset whose elements or observations are themselves sets of points in R^D, by computing a minimum volume set for the probability measures by means of a minimum enclosing ball of the representer functions in a Reproducing Kernel Hilbert Space (RKHS).

Spectral Regularization for Support Estimation

- Computer Science, Mathematics, NIPS
- 2010

A new class of regularized spectral estimators is proposed, based on a new notion of reproducing kernel Hilbert space called "completely regular", which makes it possible to capture the relevant geometric and topological properties of an arbitrary probability space.

Support Distribution Machines

- Computer Science, arXiv
- 2012

The projection of the estimated Gram matrix to the cone of semi-definite matrices enables us to employ the kernel trick, and hence use kernel machines for classification, regression, anomaly detection, and low-dimensional embedding in the space of distributions.
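Projection onto the cone of positive semidefinite matrices, as mentioned above, is commonly done by eigendecomposition and clipping of negative eigenvalues; a minimal numpy sketch (not the paper's code):

```python
import numpy as np

def project_psd(G):
    """Project a (near-)symmetric matrix onto the PSD cone by
    zeroing out its negative eigenvalues."""
    G = (G + G.T) / 2.0        # symmetrize against numerical noise
    w, V = np.linalg.eigh(G)
    return (V * np.clip(w, 0.0, None)) @ V.T

# An indefinite estimated "Gram" matrix (eigenvalues 3 and -1) becomes PSD:
G = np.array([[1.0, 2.0], [2.0, 1.0]])
G_psd = project_psd(G)
```

The resulting matrix is the nearest PSD matrix in Frobenius norm, so the kernel trick applies to it directly.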

The One Class Support Vector Machine Solution Path

- Mathematics, Computer Science, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07
- 2007

A heuristic for enforcing nestedness of the sets along the path is introduced, and a method for kernel bandwidth selection based on minimum integrated volume, a kind of AUC criterion, is presented.

A Kernel Two-Sample Test

- Mathematics, Computer Science, J. Mach. Learn. Res.
- 2012

This work proposes a framework for analyzing and comparing distributions, which is used to construct statistical tests to determine if two samples are drawn from different distributions, and presents two distribution free tests based on large deviation bounds for the maximum mean discrepancy (MMD).
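The (biased) squared-MMD statistic underlying these tests is straightforward to compute from kernel matrices; a small numpy sketch with an RBF kernel (kernel and bandwidth are illustrative choices):

```python
import numpy as np

def mmd2(X, Y, gamma=1.0):
    """Biased estimate of the squared maximum mean discrepancy
    between samples X and Y under an RBF kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(0)
# Same distribution -> statistic near zero; mean-shifted -> clearly positive.
same = mmd2(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
diff = mmd2(rng.normal(size=(200, 2)), rng.normal(1.0, 1.0, size=(200, 2)))
```

A permutation test on this statistic (re-shuffling the pooled sample) gives a distribution-free p-value, which is one of the test constructions the paper analyzes.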

Learning Minimum Volume Sets with Support Vector Machines

- Mathematics, 2006 16th IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing
- 2006

Given a probability law P on d-dimensional Euclidean space, the minimum volume set (MV-set) with mass β, 0 < β < 1, is the set with smallest volume enclosing a probability mass of at least…

## References

Showing 1–10 of 107 references

On nonparametric estimation of density level sets

- Mathematics
- 1997

Let X_1, …, X_n be independent identically distributed observations from an unknown probability density f(·). Consider the problem of estimating the level set G = G_f(λ) = {x ∈ R^2 : f(x) ≥ λ} from…

Generalization Performance of Classifiers in Terms of Observed Covering Numbers

- Mathematics, Computer Science, EuroCOLT
- 1999

It is shown that one can use an analogous argument in terms of the observed covering numbers on a single m-sample (the actual observed data points) to bound the generalization performance of a classifier via a margin-based analysis.

Detection of Abnormal Behavior Via Nonparametric Estimation of the Support

- Mathematics
- 1980

In this paper two problems are considered, both involving the nonparametric estimation of the support of a random vector from a sequence of independent identically distributed observations. In the…

Learning Distributions by Their Density Levels: A Paradigm for Learning without a Teacher

- Computer Science, J. Comput. Syst. Sci.
- 1997

It is proved that classes whose VC-dimension is finite are learnable in a very strong sense, while on the other hand, ε-covering numbers of a concept class impose lower bounds on the sample size needed for learning in the authors' models.

Structural Risk Minimization Over Data-Dependent Hierarchies

- Computer Science, IEEE Trans. Inf. Theory
- 1998

A result is presented that allows one to trade off errors on the training sample against improved generalization performance, and a more general result in terms of "luckiness" functions, which provides a quite general way for exploiting serendipitous simplicity in observed data to obtain better prediction accuracy from small training sets.

A plug-in approach to support estimation

- Mathematics
- 1997

We suggest a new approach, based on the use of density estimators, for the problem of estimating the (compact) support of a multivariate density. This subject (motivated in terms of pattern analysis…
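A plug-in estimator of this kind can be sketched with a kernel density estimate: estimate the density, then keep the region where the estimate exceeds a chosen level. The level `lam` and the data below are illustrative assumptions:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
X = rng.normal(size=1000)              # sample from an unknown density f

kde = gaussian_kde(X)                  # plug-in density estimate f_hat
lam = 0.1                              # level defining the estimated set
grid = np.linspace(-4.0, 4.0, 801)
support_est = grid[kde(grid) >= lam]   # plug-in estimate {x : f_hat(x) >= lam}
```

Taking `lam` to zero (at a rate tied to the bandwidth) recovers a support estimate rather than a level set; the paper's analysis concerns exactly this kind of density-estimator-based construction.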

Margin Distribution Bounds on Generalization

- Mathematics, Computer Science, EuroCOLT
- 1999

It is shown that a slight generalization of their construction can be used to give a PAC-style bound on the tail of the distribution of generalization errors that arise from a given sample size.

Entropy Numbers, Operators and Support Vector Kernels

- Mathematics, Computer Science, EuroCOLT
- 1999

New bounds for the generalization error of feature space machines, such as support vector machines and related regularization networks, are derived by obtaining new bounds on their covering numbers by virtue of the eigenvalues of an integral operator induced by the kernel function used by the machine.

Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators

- Mathematics, Computer Science, IEEE Trans. Inf. Theory
- 2001

New bounds for the generalization error of kernel machines, such as support vector machines and related regularization networks, are derived by obtaining new bounds on their covering numbers by using the eigenvalues of an integral operator induced by the kernel function used by the machine.

Kernel method for percentile feature extraction

- Computer Science
- 2000

A method is proposed that computes a direction in a dataset such that a specified fraction of a particular class of all examples is separated from the overall mean by a maximal margin. The method can be thought of as a robust form of principal component analysis, where instead of variance the authors maximize percentile thresholds.