Estimating the Support of a High-Dimensional Distribution

@article{Schlkopf2001EstimatingTS,
  title={Estimating the Support of a High-Dimensional Distribution},
  author={Bernhard Sch{\"o}lkopf and John C. Platt and John Shawe-Taylor and Alex Smola and Robert C. Williamson},
  journal={Neural Computation},
  year={2001},
  volume={13},
  pages={1443--1471}
}
Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a simple subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. [...]
Key Method: The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space.
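A minimal sketch of how this estimator looks in practice, using the scikit-learn implementation of the one-class SVM (not code from the paper); the kernel width gamma and the target outlier fraction nu are illustrative choices:

```python
# Hedged sketch: scikit-learn's OneClassSVM implements the nu-parameterized
# one-class SVM described above; gamma and nu are illustrative choices.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))          # training sample drawn from P

# nu upper-bounds the fraction of training points left outside the
# estimated region S and lower-bounds the fraction of support vectors.
clf = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1).fit(X)

X_test = rng.normal(size=(10, 2))
print(clf.predict(X_test))             # +1: inside S, -1: outside
```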
Support Vector Method for Novelty Detection
TLDR
The algorithm is a natural extension of the support vector algorithm to the case of unlabelled data and is regularized by controlling the length of the weight vector in an associated feature space.
Learning from positive and unlabeled examples by enforcing statistical significance
  • P. Geurts
  • Computer Science, Mathematics
    AISTATS
  • 2011
TLDR
This work formalizes the problem of characterizing the positive class as one of learning a feature-based score function that minimizes the p-value of a nonparametric statistical hypothesis test, and provides a solution computed by a one-class SVM applied to a surrogate dataset.
Covariate Shift by Kernel Mean Matching
Given sets of observations of training and test data, we consider the problem of re-weighting the training data such that its distribution more closely matches that of the test data. We achieve this [...]
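A minimal sketch of the kernel mean matching idea, under illustrative assumptions: an RBF kernel with bandwidth gamma, weight bound B, tolerance eps, and a general-purpose SLSQP solver standing in for a dedicated QP solver:

```python
# Hedged sketch of kernel mean matching (KMM): re-weight training points so
# their weighted kernel mean matches the test sample's kernel mean.
import numpy as np
from scipy.optimize import minimize
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X_tr = rng.normal(0.0, 1.0, size=(60, 1))   # training distribution
X_te = rng.normal(0.5, 1.0, size=(80, 1))   # shifted test distribution

gamma, B, eps = 0.5, 10.0, 0.1              # illustrative choices
n = len(X_tr)
K = rbf_kernel(X_tr, X_tr, gamma=gamma)                  # K_ij = k(x_i, x_j)
kappa = (n / len(X_te)) * rbf_kernel(X_tr, X_te, gamma=gamma).sum(axis=1)

# Quadratic objective 0.5 b'Kb - kappa'b; weights bounded in [0, B] and
# constrained so their mean stays within eps of 1.
obj = lambda b: 0.5 * b @ K @ b - kappa @ b
grad = lambda b: K @ b - kappa
cons = [{"type": "ineq", "fun": lambda b: n * (1 + eps) - b.sum()},
        {"type": "ineq", "fun": lambda b: b.sum() - n * (1 - eps)}]
res = minimize(obj, np.ones(n), jac=grad, bounds=[(0.0, B)] * n,
               constraints=cons, method="SLSQP")
beta = res.x                                             # importance weights
print(beta.min(), beta.mean(), beta.max())
```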
Exact rates in density support estimation
Let f be an unknown multivariate probability density with compact support S_f. Given n independent observations X_1, ..., X_n drawn from f, this paper is devoted to the study of the estimator [...]
Support Measure Data Description
TLDR
This work addresses the problem of learning a data description model for a dataset whose elements or observations are themselves sets of points in R^D, by computing a minimum volume set for the probability measures by means of a minimum enclosing ball of the representer functions in a reproducing kernel Hilbert space (RKHS).
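A rough sketch in the spirit of this approach (not the authors' code): each set of points is represented by its empirical kernel mean embedding, the expected-kernel Gram matrix between sets is computed, and a one-class SVM with a precomputed kernel stands in for the minimum enclosing ball; all parameters are illustrative:

```python
# Hedged sketch: set-level kernel via empirical mean embeddings, then a
# one-class SVM on the precomputed set-level Gram matrix as a stand-in
# for the minimum enclosing ball in the RKHS.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
sets = [rng.normal(size=(rng.integers(20, 40), 2)) for _ in range(30)]

def set_kernel(A, B, gamma=0.5):
    """<mu_A, mu_B> in the RKHS: average pairwise base-kernel value."""
    return rbf_kernel(A, B, gamma=gamma).mean()

G = np.array([[set_kernel(A, B) for B in sets] for A in sets])
clf = OneClassSVM(kernel="precomputed", nu=0.1).fit(G)
print(clf.predict(G)[:10])          # +1 inside the description, -1 outside
```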
Spectral Regularization for Support Estimation
TLDR
A new class of regularized spectral estimators is proposed, based on a new notion of reproducing kernel Hilbert space, called "completely regular", which makes it possible to capture the relevant geometric and topological properties of an arbitrary probability space.
Support Distribution Machines
TLDR
The projection of the estimated Gram matrix to the cone of positive semi-definite matrices enables us to employ the kernel trick, and hence use kernel machines for classification, regression, anomaly detection, and low-dimensional embedding in the space of distributions.
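The PSD projection step itself is generic and can be sketched as follows; eigenvalue clipping is the standard nearest-PSD projection in Frobenius norm, assumed here to be what the projection refers to:

```python
# Hedged sketch: project a symmetric (estimated, possibly indefinite) Gram
# matrix onto the PSD cone by zeroing out its negative eigenvalues.
import numpy as np

def project_psd(G):
    """Nearest PSD matrix in Frobenius norm via eigenvalue clipping."""
    G = 0.5 * (G + G.T)                    # symmetrize first
    w, V = np.linalg.eigh(G)
    return (V * np.clip(w, 0.0, None)) @ V.T

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
G = 0.5 * (A + A.T)                        # symmetric but indefinite
G_psd = project_psd(G)
print(np.linalg.eigvalsh(G_psd).min() >= -1e-10)   # True: valid kernel matrix
```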
The One Class Support Vector Machine Solution Path
  • Gyemin Lee, C. Scott
  • Mathematics, Computer Science
    2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07
  • 2007
TLDR
A heuristic for enforcing nestedness of the sets in the path is introduced, and a method for kernel bandwidth selection based on minimum integrated volume, a kind of AUC criterion, is presented.
A Kernel Two-Sample Test
TLDR
This work proposes a framework for analyzing and comparing distributions, which is used to construct statistical tests to determine whether two samples are drawn from different distributions, and presents two distribution-free tests based on large deviation bounds for the maximum mean discrepancy (MMD).
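A minimal sketch of the (biased) empirical MMD^2 statistic with an RBF kernel; the bandwidth is an illustrative choice and the paper's large-deviation test thresholds are omitted:

```python
# Hedged sketch: biased estimate of MMD^2 between two samples,
# MMD^2 = E k(x,x') + E k(y,y') - 2 E k(x,y), with an RBF kernel.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def mmd2_biased(X, Y, gamma=1.0):
    return (rbf_kernel(X, X, gamma=gamma).mean()
            + rbf_kernel(Y, Y, gamma=gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma=gamma).mean())

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 1))
Y = rng.normal(1.0, 1.0, size=(200, 1))    # shifted mean: MMD^2 clearly > 0
Z = rng.normal(0.0, 1.0, size=(200, 1))    # same distribution: MMD^2 near 0
print(mmd2_biased(X, Y), mmd2_biased(X, Z))
```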
Learning Minimum Volume Sets with Support Vector Machines
Given a probability law P on d-dimensional Euclidean space, the minimum volume set (MV-set) with mass beta, 0 < beta < 1, is the set with smallest volume enclosing a probability mass of at least beta.

References

Showing 1-10 of 107 references
On nonparametric estimation of density level sets
Let X_1, ..., X_n be independent identically distributed observations from an unknown probability density f(·). Consider the problem of estimating the level set G = G_f(λ) = {x ∈ R^2 : f(x) ≥ λ} from [...]
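A minimal plug-in sketch of level-set estimation (one standard approach, not necessarily the estimator analyzed in this paper): replace the unknown f by a kernel density estimate and threshold it at λ; the bandwidth, λ, and the evaluation grid are illustrative:

```python
# Hedged sketch: plug-in level-set estimate {x : f_hat(x) >= lambda}
# using a Gaussian kernel density estimate evaluated on a grid.
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
kde = KernelDensity(kernel="gaussian", bandwidth=0.3).fit(X)

lam = 0.05
grid = np.stack(np.meshgrid(np.linspace(-3, 3, 50),
                            np.linspace(-3, 3, 50)), axis=-1).reshape(-1, 2)
f_hat = np.exp(kde.score_samples(grid))      # estimated density values
level_set = grid[f_hat >= lam]               # plug-in estimate of G_f(lambda)
print(len(level_set), "grid points in the estimated level set")
```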
Generalization Performance of Classifiers in Terms of Observed Covering Numbers
TLDR
It is shown that one can utilize an analogous argument in terms of the observed covering numbers on a single m-sample (the actual observed data points) to bound the generalization performance of a classifier by using a margin-based analysis.
Detection of Abnormal Behavior Via Nonparametric Estimation of the Support
In this paper two problems are considered, both involving the nonparametric estimation of the support of a random vector from a sequence of independent identically distributed observations. [...]
Learning Distributions by Their Density Levels: A Paradigm for Learning without a Teacher
TLDR
It is proved that classes whose VC-dimension is finite are learnable in a very strong sense, while on the other hand, ε-covering numbers of a concept class impose lower bounds on the sample size needed for learning in the authors' models.
Structural Risk Minimization Over Data-Dependent Hierarchies
TLDR
A result is presented that allows one to trade off errors on the training sample against improved generalization performance, and a more general result in terms of "luckiness" functions, which provides a quite general way for exploiting serendipitous simplicity in observed data to obtain better prediction accuracy from small training sets.
A plug-in approach to support estimation
We suggest a new approach, based on the use of density estimators, for the problem of estimating the (compact) support of a multivariate density. This subject (motivated in terms of pattern analysis) [...]
Margin Distribution Bounds on Generalization
TLDR
It is shown that a slight generalization of their construction can be used to give a PAC-style bound on the tail of the distribution of generalization errors that arise from a given sample size.
Entropy Numbers, Operators and Support Vector Kernels
TLDR
New bounds for the generalization error of feature space machines, such as support vector machines and related regularization networks, are derived by obtaining new bounds on their covering numbers by virtue of the eigenvalues of an integral operator induced by the kernel function used by the machine.
Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators
TLDR
New bounds for the generalization error of kernel machines, such as support vector machines and related regularization networks, are derived by obtaining new bounds on their covering numbers by using the eigenvalues of an integral operator induced by the kernel function used by the machine.
Kernel method for percentile feature extraction
TLDR
A method is proposed which computes a direction in a dataset such that a specified fraction of a particular class of examples is separated from the overall mean by a maximal margin; this method can be thought of as a robust form of principal component analysis, where instead of variance the authors maximize percentile thresholds.