Recovering the number of clusters in data sets with noise features using feature rescaling factors
@article{Amorim2015RecoveringTN,
  title   = {Recovering the number of clusters in data sets with noise features using feature rescaling factors},
  author  = {Renato Cordeiro de Amorim and Christian Hennig},
  journal = {Inf. Sci.},
  year    = {2015},
  volume  = {324},
  pages   = {126--145}
}
261 Citations
Penalized k-means algorithms for finding the correct number of clusters in a dataset
- Computer Science · ArXiv
- 2019
This paper derives, for the case of ideal clusters, rigorous bounds for the coefficient of the additive penalty; it then empirically investigates certain types of deviations from the ideal-cluster assumption and shows that combining k-means with additive and multiplicative penalties can resolve ambiguous solutions.
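The additive-penalty idea described above can be illustrated with a minimal sketch: run k-means for each candidate k and pick the k minimising SSE(k) + λ·k. This is an illustrative toy implementation, not the cited paper's algorithm; the penalty coefficient `lam` and the k-means++ seeding are assumptions made here for the example.

```python
import numpy as np

def _kmeanspp_init(X, k, rng):
    """k-means++ seeding: spread initial centres apart via D^2 sampling."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = ((X[:, None, :] - np.array(centers)[None, :, :]) ** 2).sum(-1).min(1)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)

def kmeans_sse(X, k, n_iter=50, seed=0):
    """Lloyd's k-means; returns the within-cluster sum of squared errors."""
    rng = np.random.default_rng(seed)
    centers = _kmeanspp_init(X, k, rng)
    for _ in range(n_iter):
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                centers[j] = pts.mean(0)
    return ((X - centers[labels]) ** 2).sum()

def penalized_k(X, k_max=10, lam=1.0):
    """Choose k minimising SSE(k) + lam * k (additive penalty).

    SSE alone decreases monotonically in k; the penalty term makes the
    objective turn upward once extra clusters stop paying for themselves.
    Best-of-3 restarts guard against bad local optima.
    """
    scores = {k: min(kmeans_sse(X, k, seed=s) for s in range(3)) + lam * k
              for k in range(1, k_max + 1)}
    return min(scores, key=scores.get)
```

For well-separated clusters the SSE drop from k to k+1 is large until k reaches the true number of clusters and small afterwards, which is why a single linear penalty can expose the elbow.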
Penalized K-Means Algorithms for Finding the Number of Clusters
- Computer Science, Mathematics · 2020 25th International Conference on Pattern Recognition (ICPR)
- 2021
This paper derives rigorous bounds for the coefficient of the additive penalty in k-means for ideal clusters, and finds that a multiplicative penalty generally produces a more reliable signature for the correct number of clusters than the additive penalty in cases where the ideal-cluster assumption holds.
A-Wardpβ: Effective hierarchical clustering using the Minkowski metric and a fast k-means initialisation
- Computer Science · Inf. Sci.
- 2016
I-nice: A new approach for identifying the number of clusters and initial cluster centres
- Computer Science · Inf. Sci.
- 2018
A Survey on Feature Weighting Based K-Means Algorithms
- Computer Science · J. Classif.
- 2016
This paper elaborates on the concept of feature weighting and critically analyses some of the most popular, and most innovative, feature-weighting mechanisms based on K-Means.
Unsupervised feature selection with multi-subspace randomization and collaboration
- Computer Science · Knowl. Based Syst.
- 2019
A New Assessment of Cluster Tendency Ensemble approach for Data Clustering
- Computer Science · SoICT 2018
- 2018
An improved SACT method for data clustering, the eSACT algorithm, exhibits high performance, reliability, and accuracy in the assessment of cluster tendency compared to previously proposed algorithms.
A hierarchical Gamma Mixture Model-based method for estimating the number of clusters in complex data
- Computer Science · Appl. Soft Comput.
- 2020
References
Showing 1–10 of 36 references
Minkowski metric, feature weighting and anomalous cluster initializing in K-Means clustering
- Computer Science · Pattern Recognit.
- 2012
Intelligent Choice of the Number of Clusters in K-Means Clustering: An Experimental Study with Different Cluster Spreads
- Computer Science · J. Classif.
- 2010
An experimental setting is proposed for comparing different approaches on data generated from Gaussian clusters with controlled parameters for between- and within-cluster spread, modelling cluster intermix and evaluating centroid recovery alongside the conventional evaluation of cluster recovery.
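A data generator of the kind described above can be sketched in a few lines: centroids are drawn with one spread parameter and points around them with another, so the ratio of the two controls cluster intermix. This is a minimal sketch of the general idea, not the cited paper's generator; the parameter names `between` and `within` are assumptions made here.

```python
import numpy as np

def gaussian_clusters(n_clusters=3, n_per=100, dim=2,
                      between=5.0, within=1.0, seed=0):
    """Generate labelled Gaussian clusters.

    Centroids are sampled with standard deviation `between`; points are
    sampled around their centroid with standard deviation `within`.
    A large between/within ratio gives well-separated clusters; a small
    one gives heavy cluster intermix.
    """
    rng = np.random.default_rng(seed)
    centroids = rng.normal(0.0, between, size=(n_clusters, dim))
    X = np.vstack([rng.normal(c, within, size=(n_per, dim))
                   for c in centroids])
    y = np.repeat(np.arange(n_clusters), n_per)  # ground-truth labels
    return X, y
```

Having the ground-truth labels and centroids available is what allows centroid recovery to be evaluated on par with cluster recovery.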
Experiments for the Number of Clusters in K-Means
- Computer Science · EPIA Workshops
- 2007
An adjusted iK-Means method is proposed, which performs well in the current experimental setting; it is compared to the least-squares and least-modules versions of Mirkin's intelligent K-Means method.
An examination of procedures for determining the number of clusters in a data set
- Computer Science
- 1994
The aim of this paper is to compare three methods based on the hypervolume criterion with four other well-known methods for determining the number of clusters on artificial data sets.
K-means clustering: a half-century synthesis.
- Computer Science · The British Journal of Mathematical and Statistical Psychology
- 2006
This paper synthesizes the results, methodology, and research conducted concerning the K-means clustering method over the last fifty years, leading to a unifying treatment of K-Means and some of its extensions.
Some new indexes of cluster validity
- Computer Science · IEEE Trans. Syst. Man Cybern. Part B
- 1998
This work reviews two clustering algorithms and three indexes of crisp cluster validity and shows that while Dunn's original index has operational flaws, the concept it embodies provides a rich paradigm for validation of partitions that have cloud-like clusters.
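Dunn's original index, referenced above, is the ratio of the smallest between-cluster separation to the largest within-cluster diameter; larger values indicate compact, well-separated partitions. A minimal sketch of that definition (not the cited paper's own code):

```python
import numpy as np

def dunn_index(X, labels):
    """Dunn's index: min inter-cluster distance / max intra-cluster diameter."""
    clusters = [X[labels == c] for c in np.unique(labels)]

    def diam(A):
        # diameter = largest pairwise distance within one cluster
        return np.linalg.norm(A[:, None] - A[None, :], axis=-1).max()

    def sep(A, B):
        # separation = smallest pairwise distance between two clusters
        return np.linalg.norm(A[:, None] - B[None, :], axis=-1).min()

    max_diam = max(diam(c) for c in clusters)
    min_sep = min(sep(a, b) for i, a in enumerate(clusters)
                  for b in clusters[i + 1:])
    return min_sep / max_diam
```

The operational flaw noted in the abstract is visible here: both the max-diameter and min-separation terms depend on single extreme points, so one outlier can collapse the index.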
Data Clustering: 50 Years Beyond K-means
- Computer Science · ECML/PKDD
- 2008
The practice of classifying objects according to perceived similarities is the basis for much of science. Organizing data into sensible groupings is one of the most fundamental modes of understanding…
On Initializations for the Minkowski Weighted K-Means
- Computer Science · IDA
- 2012
It is found that the Ward method in the Minkowski space tends to outperform other initializations, with the exception of low-dimensional Gaussian Models with noise features.
On comparing partitions
- Mathematics
- 2015
Rand (1971) proposed the Rand Index to measure the stability of two partitions of one set of units. Hubert and Arabie (1985) corrected the Rand Index for chance (Adjusted Rand Index). In this paper,…
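The Hubert–Arabie chance correction mentioned above can be computed directly from the pair-counting contingency table. A minimal sketch of the standard Adjusted Rand Index formula (a well-known formula, not this paper's contribution):

```python
from collections import Counter
from math import comb

def adjusted_rand_index(a, b):
    """Adjusted Rand Index of two labelings of the same n items.

    Returns 1.0 for identical partitions and roughly 0 in expectation
    for independent random partitions. Degenerate inputs where both
    partitions put everything in one cluster are not handled here.
    """
    n = len(a)
    # pairs placed together by both labelings (contingency-cell pairs)
    sum_ab = sum(comb(v, 2) for v in Counter(zip(a, b)).values())
    # pairs placed together by each labeling separately (marginals)
    sum_a = sum(comb(v, 2) for v in Counter(a).values())
    sum_b = sum(comb(v, 2) for v in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)   # chance agreement
    max_index = (sum_a + sum_b) / 2
    return (sum_ab - expected) / (max_index - expected)
```

Unlike the raw Rand Index, this can go negative when two partitions agree less than chance would predict.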
Automated variable weighting in k-means type clustering
- Computer Science · IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2005
A new step is introduced into the k-means clustering process to iteratively update variable weights based on the current partition of the data; a formula for weight calculation is proposed, and a convergence theorem for the new clustering process is given.
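The weight-update step described above can be sketched as follows: given the current partition, each feature's weight is set from its within-cluster dispersion, so features that vary little inside clusters (informative features) receive large weights and noisy features receive small ones. This is a minimal sketch in the spirit of that update, not the paper's full algorithm, which alternates it with the usual assignment and centroid steps; the exponent parameter `beta` here plays the role of the paper's weight exponent.

```python
import numpy as np

def update_weights(X, labels, centers, beta=2.0):
    """One feature-weight update: w_v = 1 / sum_t (D_v / D_t)^(1/(beta-1)),
    where D_v is the within-cluster dispersion of feature v.

    The resulting weights sum to 1, and lower-dispersion features
    receive larger weights. Requires beta > 1.
    """
    # D[v]: total squared deviation from assigned centroids on feature v
    D = ((X - centers[labels]) ** 2).sum(axis=0) + 1e-12  # guard against /0
    exponent = 1.0 / (beta - 1.0)
    w = 1.0 / np.array([((Dv / D) ** exponent).sum() for Dv in D])
    return w
```

Feature rescaling factors of the kind studied in the surveyed paper act in the same spirit: they shrink the influence of noise features on the distance computation.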