Estimating the Support of a High-Dimensional Distribution

Abstract

Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. We propose a method to approach this problem by trying to estimate a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.
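The algorithm described in the abstract is commonly known as the one-class SVM. A minimal sketch of its use, assuming scikit-learn (whose `OneClassSVM` implements this method) is available: the `nu` parameter plays the role of the a priori specified bound on the fraction of points falling outside the estimated region S, and the learned decision function f is positive inside S and negative on its complement.

```python
# Sketch: estimating the support of a distribution with scikit-learn's
# OneClassSVM, an implementation of the algorithm described above.
# The kernel choice and parameter values here are illustrative assumptions.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
X_train = rng.randn(200, 2)  # samples drawn from the underlying distribution P

# nu upper-bounds the fraction of training points treated as outliers,
# i.e. the specified probability of a point lying outside S.
clf = OneClassSVM(kernel="rbf", nu=0.1, gamma=0.5)
clf.fit(X_train)

# predict() returns +1 for points inside the estimated region S, -1 outside
inlier_fraction = np.mean(clf.predict(X_train) == 1)
print(f"fraction of training points inside S: {inlier_fraction:.2f}")

# a point far from the training data should fall outside the estimated support
far_point = np.array([[10.0, 10.0]])
print(clf.predict(far_point))  # expect [-1]
```

Only a subset of the training points (the support vectors) receives nonzero expansion coefficients, which is what makes the kernel expansion in f potentially sparse.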

DOI: 10.1162/089976601750264965



Cite this paper

@article{Schlkopf2001EstimatingTS,
  title={Estimating the Support of a High-Dimensional Distribution},
  author={Bernhard Sch{\"o}lkopf and John C. Platt and John Shawe-Taylor and Alexander J. Smola and Robert C. Williamson},
  journal={Neural Computation},
  year={2001},
  volume={13},
  number={7},
  pages={1443--1471}
}