Unsupervised Feature Selection for the $k$-means Clustering Problem

Abstract

We present a novel feature selection algorithm for the k-means clustering problem. Our algorithm is randomized and, given an accuracy parameter ε ∈ (0, 1), selects and appropriately rescales, in an unsupervised manner, Θ(k log(k/ε)/ε) features from a dataset of arbitrary dimensions. We prove that, if we run any γ-approximate k-means algorithm (γ ≥ 1) on the features selected using our method, we can find a (1 + (1 + ε)γ)-approximate partition with high probability.
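The abstract does not say how features are sampled or rescaled, so the following is only a minimal sketch under stated assumptions: it draws features with probabilities proportional to the leverage scores of the top-k right singular vectors and rescales each sampled column by 1/sqrt(r·p_i). The function names, the parameter `r` (the number of features to keep, which the abstract bounds only up to constants), and the choice of distribution are illustrative assumptions, not details confirmed by the text above.

```python
import numpy as np


def select_features(A, k, r, rng=None):
    """Sample and rescale r columns (features) of the n-by-d matrix A.

    Assumption: the sampling distribution is the normalized leverage
    scores of the top-k right singular vectors. The abstract does not
    specify the distribution, so this is an illustrative choice.
    """
    rng = np.random.default_rng(rng)
    # Rows of Vt are the right singular vectors; keep the top k of them.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    Vk = Vt[:k].T  # shape (d, k), orthonormal columns
    # Squared row norms of Vk sum to k, so dividing by k gives a distribution.
    p = np.sum(Vk ** 2, axis=1) / k
    p = p / p.sum()  # guard against floating-point drift
    # Draw r feature indices i.i.d. with replacement.
    idx = rng.choice(A.shape[1], size=r, replace=True, p=p)
    # Rescale each sampled column by 1 / sqrt(r * p_i).
    scale = 1.0 / np.sqrt(r * p[idx])
    return A[:, idx] * scale, idx


def approximation_factor(gamma, eps):
    """Guarantee stated in the abstract: 1 + (1 + eps) * gamma."""
    return 1.0 + (1.0 + eps) * gamma
```

For instance, pairing the selected features with an exact k-means solver (γ = 1) at ε = 0.5 gives `approximation_factor(1.0, 0.5) == 2.5`. Any γ-approximate k-means routine could then be run on the reduced matrix returned by `select_features`.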

Cite this paper

@inproceedings{Boutsidis2009UnsupervisedFS,
  title={Unsupervised Feature Selection for the $k$-means Clustering Problem},
  author={Christos Boutsidis and Michael W. Mahoney and Petros Drineas},
  booktitle={NIPS},
  year={2009}
}