Size Matters: Cardinality-Constrained Clustering and Outlier Detection via Conic Optimization

@article{Rujeerapaiboon2019SizeMC,
  title={Size Matters: Cardinality-Constrained Clustering and Outlier Detection via Conic Optimization},
  author={Napat Rujeerapaiboon and K. Schindler and D. Kuhn and W. Wiesemann},
  journal={SIAM J. Optim.},
  year={2019},
  volume={29},
  pages={1211--1239}
}
  • Napat Rujeerapaiboon, K. Schindler, D. Kuhn, W. Wiesemann
  • Published 2019
  • Mathematics, Computer Science
  • SIAM J. Optim.
  • Plain vanilla K-means clustering has proven to be successful in practice, yet it suffers from outlier sensitivity and may produce highly unbalanced clusters. To mitigate both shortcomings, we formulate a joint outlier detection and clustering problem, which assigns a prescribed number of datapoints to an auxiliary outlier cluster and performs cardinality-constrained K-means clustering on the residual dataset, treating the cluster cardinalities as a given input. We cast this problem as a mixed…
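
The objective described in the abstract can be illustrated with a minimal sketch. It only evaluates the K-means cost of a given assignment under prescribed cluster sizes and a prescribed number of outliers; it is not the paper's conic-optimization approach, and the names `clustering_cost`, `OUTLIER`, `sizes`, and `n_outliers` are hypothetical choices made for this illustration.

```python
from collections import Counter

OUTLIER = -1  # hypothetical label marking the auxiliary outlier cluster


def clustering_cost(points, labels, sizes, n_outliers):
    """Sum of squared distances to cluster means, ignoring outliers.

    Raises ValueError if the assignment violates the prescribed
    cluster cardinalities or the prescribed number of outliers.
    """
    if len(points) != len(labels):
        raise ValueError("one label per point is required")
    if sum(sizes) + n_outliers != len(points):
        raise ValueError("sizes and n_outliers must account for every point")

    counts = Counter(labels)
    if counts.get(OUTLIER, 0) != n_outliers:
        raise ValueError("wrong number of outliers")
    for k, size in enumerate(sizes):
        if counts.get(k, 0) != size:
            raise ValueError(f"cluster {k} must contain {size} points")

    cost = 0.0
    for k, size in enumerate(sizes):
        if size == 0:
            continue  # an empty cluster contributes nothing
        members = [p for p, label in zip(points, labels) if label == k]
        dim = len(members[0])
        # Cluster mean, computed coordinate by coordinate.
        mean = [sum(p[d] for p in members) / size for d in range(dim)]
        cost += sum((p[d] - mean[d]) ** 2 for p in members for d in range(dim))
    return cost


# Two clusters of size 2 and one point set aside as an outlier.
points = [(0, 0), (0, 2), (10, 0), (10, 2), (100, 100)]
labels = [0, 0, 1, 1, OUTLIER]
print(clustering_cost(points, labels, sizes=[2, 2], n_outliers=1))  # 4.0
```

In this toy example the distant point (100, 100) is assigned to the outlier cluster, so it does not pull either cluster mean toward it, and the cardinality check rejects any assignment that would leave the two clusters unbalanced.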