An Interior Point Algorithm for Minimum Sum-of-Squares Clustering

@article{Merle1999AnIP,
  title={An Interior Point Algorithm for Minimum Sum-of-Squares Clustering},
  author={Olivier du Merle and Pierre Hansen and Brigitte Jaumard and Nenad Mladenovi{\'c}},
  journal={SIAM J. Sci. Comput.},
  year={1999},
  volume={21},
  pages={1485--1505}
}
An exact algorithm is proposed for minimum sum-of-squares nonhierarchical clustering, i.e., for partitioning a given set of points from a Euclidean m-space into a given number of clusters in order to minimize the sum of squared distances from all points to the centroid of the cluster to which they belong. This problem is expressed as a constrained hyperbolic program in 0-1 variables. The resolution method combines an interior point algorithm, i.e., a weighted analytic center column generation… 
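
As a point of reference for the objective described in the abstract, the following is a minimal sketch (not code from the paper; the helper name mssc_cost is illustrative) that evaluates the sum-of-squares criterion for a fixed assignment of points to clusters.

```python
# Minimal sketch: the minimum sum-of-squares clustering (MSSC) objective for a
# fixed partition. Names are illustrative, not taken from the original article.
import numpy as np

def mssc_cost(points, labels):
    """Sum of squared Euclidean distances from each point to the centroid
    of the cluster it is assigned to."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        cluster = points[labels == c]
        centroid = cluster.mean(axis=0)             # cluster centroid
        total += ((cluster - centroid) ** 2).sum()  # within-cluster sum of squares
    return total

# Example: 5 points in the plane split into 2 clusters.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]]
print(mssc_cost(X, [0, 0, 0, 1, 1]))
```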

Citations of this paper

Evaluating a branch-and-bound RLT-based algorithm for minimum sum-of-squares clustering
TLDR
A reformulation-linearization based branch-and-bound algorithm for minimum sum-of-squares clustering, reported to solve instances with up to 1,000 points, is investigated in further detail, and some of the original computational experiments are reproduced.
An improved column generation algorithm for minimum sum-of-squares clustering
TLDR
This work proposes a new way to solve the auxiliary problem of finding a column with negative reduced cost based on geometric arguments that greatly improves the efficiency of the whole algorithm and leads to exact solution of instances with over 2,300 entities.
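
For context (generic notation, not claimed to be verbatim from either paper): in set-partitioning formulations of MSSC solved by column generation, each column is a candidate cluster C whose cost is its within-cluster sum of squares, and the auxiliary (pricing) problem searches for a cluster of negative reduced cost,

\[
\bar{c}_C \;=\; \sum_{i \in C} \lVert a_i - \bar{x}_C \rVert^2 \;-\; \sum_{i \in C} \lambda_i \;<\; 0,
\]

where $\bar{x}_C$ is the centroid of $C$ and the $\lambda_i$ are dual values of the constraints assigning each point to exactly one cluster (a further dual term appears if the number of clusters is fixed by an explicit constraint).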
An Exact Algorithm for Semi-supervised Minimum Sum-of-Squares Clustering
TLDR
This paper presents a new branch-and-bound algorithm for semi-supervised MSSC, where background knowledge is incorporated as pairwise must-link and cannot-link constraints, and efficiently solves real-world instances with up to 800 data points under different combinations of must-link and cannot-link constraints.
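
As an aside on how such background knowledge is commonly modelled (a generic encoding, not necessarily the one used in the cited paper): with binary variables $x_{ic} = 1$ if point $i$ is assigned to cluster $c$, a must-link pair $(i,j)$ can be written as $x_{ic} = x_{jc}$ for all $c$, and a cannot-link pair as $x_{ic} + x_{jc} \le 1$ for all $c$.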
A heuristic algorithm for solving the minimum sum-of-squares clustering problems
TLDR
This paper uses an auxiliary cluster problem to generate a set of initial points and applies the modified global k-means algorithm starting from these points to find the global solution to the clustering problems.
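
A minimal sketch of the "refine from good starting points" idea described above, using a plain Lloyd-style k-means iteration started from supplied centers (generic code, not the modified global k-means of the cited paper; the function name kmeans_from_starts is illustrative):

```python
# Lloyd-style k-means refinement from user-supplied starting centers.
import numpy as np

def kmeans_from_starts(points, centers, iters=100):
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(iters):
        # assign each point to its nearest current center
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # recompute each center as the centroid of its assigned points
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```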
Optimal Partitioning of a Data Set Based on the p-Median Model
TLDR
It is demonstrated that a three-stage procedure consisting of a greedy heuristic, Lagrangian relaxation, and a branch-and-bound algorithm can produce globally optimal solutions for p-median problems of nontrivial size.
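
For reference, the Lagrangian relaxation stage in such procedures typically dualizes the constraints assigning each entity to exactly one median (a generic formulation, not necessarily the exact one used in the cited study): with distances $d_{ij}$, assignment variables $x_{ij}$ and location variables $y_j$, relaxing $\sum_j x_{ij} = 1$ with multipliers $\lambda_i$ gives

\[
L(\lambda) \;=\; \sum_i \lambda_i \;+\; \min \Big\{ \sum_j \sum_i (d_{ij} - \lambda_i)\, x_{ij} \;:\; x_{ij} \le y_j,\; \textstyle\sum_j y_j = p \Big\},
\]

which separates by candidate median $j$: compute $\rho_j = \sum_i \min(0,\, d_{ij} - \lambda_i)$ and open the $p$ medians with smallest $\rho_j$.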
Improving spectral bounds for clustering problems by Lagrangian relaxation
TLDR
This paper investigates how to tighten the spectral bounds by using Lagrangian relaxation and subgradient optimization methods.
Minimum Sum-of-Squares Clustering by DC Programming and DCA
TLDR
A new approach based on DC (Difference of Convex functions) programming and DCA (DC Algorithm) for clustering under the minimum sum-of-squares Euclidean criterion, showing the efficiency of DCA and its superiority over k-means, a standard clustering method.
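
One standard DC decomposition of the MSSC objective (shown here only for context; the cited paper may use a different but equivalent form) writes the clustering cost with centers $x_1,\dots,x_k$ as a difference of two convex functions:

\[
\sum_{i=1}^{n} \min_{1 \le j \le k} \lVert a_i - x_j \rVert^2
\;=\;
\underbrace{\sum_{i=1}^{n} \sum_{j=1}^{k} \lVert a_i - x_j \rVert^2}_{G(x)}
\;-\;
\underbrace{\sum_{i=1}^{n} \max_{1 \le j \le k} \sum_{l \ne j} \lVert a_i - x_l \rVert^2}_{H(x)},
\]

and DCA then iterates by linearizing $H$ at the current point and minimizing the resulting convex majorant of $G - H$.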
Side-constrained minimum sum-of-squares clustering: mathematical programming and random projections
TLDR
It is shown that when side constraints make k-means inapplicable, the proposed methodology—which is easy and fast to implement and deploy—can obtain good solutions in limited amounts of time.
The Extended Hyperbolic Smoothing Clustering Method
TLDR
The minimum sum-of-squares clustering problem is considered, and a smoothing strategy using a special C∞ differentiable class of functions, called Hyperbolic Smoothing, is proposed, which allows the main difficulties presented by the original problem to be overcome.
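
For context, hyperbolic smoothing replaces non-differentiable min/max terms by an infinitely differentiable approximation controlled by a parameter $\tau > 0$; a typical building block (a generic form, not necessarily the exact one of the cited paper) is

\[
\phi(y, \tau) \;=\; \frac{y + \sqrt{y^2 + \tau^2}}{2} \;\approx\; \max(y, 0),
\]

which is C∞ for $\tau > 0$ and converges to $\max(y,0)$ as $\tau \to 0$.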
…

References

SHOWING 1-10 OF 52 REFERENCES
Evaluation of a Branch and Bound Algorithm for Clustering
TLDR
A branch and bound algorithm for optimal clustering is developed and applied to a variety of test problems; the study concludes that the method is practical for problems of up to about 100 observations if the number of clusters is about 6 or fewer and the clusters are reasonably well separated.
Decomposition and nondifferentiable optimization with the projective algorithm
TLDR
This paper deals with an application of a variant of Karmarkar's projective algorithm for linear programming to the solution of a generic nondifferentiable minimization problem, based on a column generation technique defining a sequence of primal linear programming maximization problems.
Minimum Sum of Squares Clustering in a Low Dimensional Space
TLDR
An exact polynomial algorithm, with a complexity in O(N^{p+1} log N), is proposed for minimum sum of squares hierarchical divisive clustering of points in a p-dimensional space with small p.
Bicriterion Cluster Analysis
  • M. Delattre, P. Hansen
  • Computer Science
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 1980
TLDR
It is shown that the problem of determining a partition into a given number of clusters with minimum diameter or with maximum split can be solved by the classical single-link clustering algorithm and by a graph-theoretic algorithm involving the optimal coloration of a sequence of partial graphs.
Variable neighborhood search
Integer Programming and the Theory of Grouping
This paper is written with three objectives in mind. First, to point out that the problem of grouping, where a larger number of elements n are combined into m mutually exclusive groups (m < …
An Algorithm for Euclidean Sum of Squares Classification
TLDR
The problem is reformulated in non-linear programming terms, a new algorithm for seeking the minimum sum of squared distances about the g centroids is described, and an efficient hybrid algorithm is introduced.
Cluster Analysis and Mathematical Programming
TLDR
Cluster analysis involves the problem of optimal partitioning of a given set of entities into a pre-assigned number of mutually exclusive and exhaustive clusters that lead to different kinds of linear and non-linear integer programming problems.
Application of weighted Voronoi diagrams and randomization to variance-based k-clustering
  • M. Inaba
  • Computer Science, Mathematics
    SoCG 1994
  • 1994
In this paper we consider the k-clustering problem for a set S of n points p_i in d-dimensional space with variance-based errors as clustering criteria, motivated from the color …
On Nonlinear Fractional Programming
The main purpose of this paper is to delineate an algorithm for fractional programming with nonlinear as well as linear terms in the numerator and denominator. The algorithm presented is based on a …
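
A minimal sketch of the classical parametric (Dinkelbach-type) approach to fractional programming referenced above, assuming access to an oracle that maximizes N(x) - q·D(x) over the feasible set (the oracle name solve_parametric and the driver dinkelbach are illustrative, not from the cited paper):

```python
# Parametric method for maximizing N(x)/D(x) with D(x) > 0.
# `solve_parametric(q)` is an assumed oracle returning an x that maximizes
# N(x) - q*D(x) over the feasible set.
def dinkelbach(N, D, solve_parametric, x0, tol=1e-9, max_iter=100):
    x = x0
    for _ in range(max_iter):
        q = N(x) / D(x)                 # current ratio estimate
        x = solve_parametric(q)         # maximize N(x) - q*D(x)
        if N(x) - q * D(x) <= tol:      # F(q) ~ 0  =>  q is (near-)optimal
            return q, x
    return N(x) / D(x), x
```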
…