k-means--: A Unified Approach to Clustering and Outlier Detection

Sanjay Chawla and Aristides Gionis
We present a unified approach for simultaneously clustering and discovering outliers in data. Our approach is formalized as a generalization of the k-means problem. We prove that the problem is NP-hard and then present a practical polynomial-time algorithm, which is guaranteed to converge to a local optimum. Furthermore, we extend our approach to all distance measures that can be expressed in the form of a Bregman divergence. Experiments on synthetic and real datasets demonstrate the…
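The abstract does not spell out the procedure. The following is a minimal sketch, assuming the alternating scheme usually associated with this method: at each iteration, the l points farthest from their nearest center are set aside as outliers, the remaining points are assigned to their nearest center, and each center is recomputed from its non-outlier members. The function name `kmeans_minus_minus`, the parameters `k`, `l`, `iters`, `seed`, and `init`, the random initialisation, and the use of squared Euclidean distance are illustrative assumptions, not taken from this page.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_minus_minus(points, k, l, iters=100, seed=0, init=None):
    """Sketch: cluster `points` into k groups while setting aside l outliers.

    Returns (centers, labels, outliers). labels[i] is the cluster index of
    point i, or -1 if it was set aside; outliers is a sorted list of indices.
    """
    rng = random.Random(seed)
    # Assumed initialisation: caller-supplied centers, else k random data points.
    centers = list(init) if init is not None else rng.sample(points, k)
    outliers = None
    n = len(points)
    for _ in range(iters):
        # Distance from every point to its nearest current center.
        nearest = [min((sq_dist(p, c), j) for j, c in enumerate(centers))
                   for p in points]
        # The l points farthest from their nearest center become outliers.
        order = sorted(range(n), key=lambda i: nearest[i][0], reverse=True)
        new_outliers = set(order[:l])
        labels = [-1 if i in new_outliers else nearest[i][1] for i in range(n)]
        # Recompute each center as the mean of its non-outlier members.
        new_centers = []
        for j in range(k):
            members = [points[i] for i in range(n) if labels[i] == j]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        # Stop once neither the centers nor the outlier set changes.
        if new_centers == centers and new_outliers == outliers:
            break
        centers, outliers = new_centers, new_outliers
    return centers, labels, sorted(outliers)


# Usage: two tight groups plus one distant point, with k=2 and l=1.
pts = [(0, 0), (0, 1), (1, 0), (1, 1),
       (10, 10), (10, 11), (11, 10), (11, 11), (100, 100)]
centers, labels, outliers = kmeans_minus_minus(pts, k=2, l=1, init=pts[:2])
print(centers, labels, outliers)
```

Because the procedure only reaches a local optimum, the result depends on the starting centers; the usage example passes `init` explicitly so that the run is reproducible.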
This paper has 85 citations.

Key Quantitative Results

  • In particular, on the famous KDD Cup network-intrusion dataset, we were able to increase the precision of the outlier detection task by nearly 100% compared to the classical nearest-neighbor approach.


Publications citing this paper.

Clustering with Outlier Removal

ArXiv • 2018

RODS: Rarity based Outlier Detection in a Sparse Coding Framework

IEEE Transactions on Knowledge and Data Engineering • 2016

Simultaneous clustering and outlier detection using dominant sets

2016 23rd International Conference on Pattern Recognition (ICPR) • 2016

K-means Clustering with Outlier Removal

Pattern Recognition Letters • 2017
