Unsupervised Feature Selection for the $k$-means Clustering Problem


We present a novel feature selection algorithm for the k-means clustering problem. Our algorithm is randomized and, given an accuracy parameter ε ∈ (0, 1), selects and appropriately rescales, in an unsupervised manner, Θ(k log(k/ε)/ε) features from a dataset of arbitrary dimensions. We prove that, if we run any γ-approximate k-means algorithm (γ ≥ 1) on the features selected using our method, we can find a (1 + (1 + ε)γ)-approximate partition with high probability.
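
The abstract does not say which sampling distribution is used, so the following is only a minimal sketch of what "select and rescale Θ(k log(k/ε)/ε) features" could look like. It assumes leverage-score sampling from the top-k right singular vectors, a common choice for randomized column selection, and should not be read as the authors' exact procedure. The function name `select_features`, the constant `c`, and the `seed` argument are hypothetical.

```python
import numpy as np


def select_features(A, k, eps, c=1.0, seed=0):
    """Sample and rescale columns (features) of the n x d data matrix A.

    Assumption: sampling probabilities are rank-k leverage scores. The
    abstract does not state the distribution the paper actually uses.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape

    # Number of sampled features: r = c * k * log(k / eps) / eps.
    # The constant c is hypothetical; the abstract only gives the Theta rate.
    r = max(1, int(np.ceil(c * k * np.log(k / eps) / eps)))

    # Top-k right singular vectors of A, stored as a d x k matrix.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    Vk = Vt[:k].T

    # Leverage-score probabilities: squared row norms of Vk, normalized.
    p = np.sum(Vk ** 2, axis=1)
    p = p / p.sum()

    # Draw r feature indices i.i.d. with replacement.
    idx = rng.choice(d, size=r, replace=True, p=p)

    # Rescale each sampled column by 1 / sqrt(r * p_i).
    scale = 1.0 / np.sqrt(r * p[idx])
    return A[:, idx] * scale, idx
```

Under these assumptions, any γ-approximate k-means algorithm would then be run on the returned n x r matrix instead of on the full A.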

Cite this paper

@inproceedings{Boutsidis2009UnsupervisedFS,
  title={Unsupervised Feature Selection for the $k$-means Clustering Problem},
  author={Christos Boutsidis and Michael W. Mahoney and Petros Drineas},
  booktitle={NIPS},
  year={2009}
}