Shape complexity in cluster analysis
@article{Aguilar2022ShapeCI, title={Shape complexity in cluster analysis}, author={Eduardo Jes{\'u}s Aguilar and Valmir Carneiro Barbosa}, journal={ArXiv}, year={2022}, volume={abs/2205.08046} }
In cluster analysis, a common first step is to scale the data with the aim of partitioning them into clusters more effectively. Even though many different techniques have been introduced to this end over the years, it is probably fair to say that the workhorse of this preprocessing phase has been to divide the data by the standard deviation along each dimension. Like the standard deviation, the great majority of scaling techniques can be said to have roots in some sort of statistical take on the data. Here…
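As a concrete illustration of the baseline the abstract describes (dividing the data by the standard deviation along each dimension), here is a minimal sketch in plain Python. The function name `scale_by_std`, the use of the sample standard deviation, the handling of constant columns, and the toy data are all assumptions made for illustration; this is not the method proposed in the paper.

```python
from statistics import stdev

def scale_by_std(rows):
    """Divide every column (dimension) by its sample standard deviation.

    `rows` is a list of equal-length lists of numbers, one list per data point.
    Columns with zero spread are left unchanged (illustrative choice).
    """
    columns = list(zip(*rows))
    # `or 1.0` replaces a zero standard deviation to avoid division by zero.
    sds = [stdev(col) or 1.0 for col in columns]
    return [[value / sd for value, sd in zip(row, sds)] for row in rows]

# Toy data: the second dimension has a much larger spread than the first.
data = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
scaled = scale_by_std(data)
```

After scaling, both dimensions have unit sample standard deviation, so neither dominates a Euclidean distance purely because of its units.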
References
{m
- Geology, ACML
- 2020
The master programme in Applied Geology aims to provide comprehensive knowledge based on various branches of Geology, with special focus on Applied Geology subjects in the areas of Geomorphology, Structural Geology, Hydrogeology, Petroleum Geology, Mining Geology, Remote Sensing, and Environmental Geology.
Pooled variable scaling for cluster analysis
- Computer Science, Bioinform.
- 2020
This work proposes a new approach for scaling prior to cluster analysis based on the concept of pooled variance and uses this approach to cluster a high dimensional genomic dataset consisting of gene expression data for several specimens of breast cancer cells tissue obtained from human patients.
A study of standardization of variables in cluster analysis
- Computer Science
- 1988
The present simulation study examined the standardization problem and found that those approaches which standardize by division by the range of the variable gave consistently superior recovery of the underlying cluster structure.
Weighted Standardization: A General Data Transformation Method Preceding Classification Procedures
- Computer Science
- 1986
During preparatory steps of data for automatic classification routines, the amount of information contained by the character distribution is reduced by standardization of the character values. This…
Optimal variable weighting for hierarchical clustering: An alternating least-squares algorithm
- Computer Science
- 1985
A new methodology which simultaneously estimates in a least-squares fashion both an ultrametric tree and respective variable weightings for profile data that have been converted into (weighted) Euclidean distances is presented.
Synthesized clustering: A method for amalgamating alternative clustering bases with differential weighting of variables
- Computer Science
- 1984
A new method (SYNCLUS, SYNthesized CLUStering) is proposed for dealing with the problem of how the various contributory variables in a specific battery can be weighted so as to enhance some cluster structure that may be present.
On comparing partitions
- Mathematics
- 2015
Rand (1971) proposed the Rand Index to measure the stability of two partitions of one set of units. Hubert and Arabie (1985) corrected the Rand Index for chance (Adjusted Rand Index). In this paper,…
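The Rand Index and its chance-corrected version mentioned in the last entry can be sketched with pair counting, as below. This follows the standard textbook definitions rather than anything specific to the cited paper. The function names, the helper `_pair_counts`, and the example labelings are assumptions made for illustration.

```python
from collections import Counter
from math import comb

def _pair_counts(a, b):
    """Count pairs placed together in both partitions, in `a`, and in `b`."""
    same_both = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    same_a = sum(comb(c, 2) for c in Counter(a).values())
    same_b = sum(comb(c, 2) for c in Counter(b).values())
    return same_both, same_a, same_b

def rand_index(a, b):
    """Fraction of point pairs on which the two partitions agree."""
    pairs = comb(len(a), 2)
    same_both, same_a, same_b = _pair_counts(a, b)
    # Agreements: pairs together in both, plus pairs separated in both.
    apart_both = pairs - same_a - same_b + same_both
    return (same_both + apart_both) / pairs

def adjusted_rand_index(a, b):
    """Rand Index corrected for chance (Hubert and Arabie form)."""
    pairs = comb(len(a), 2)
    same_both, same_a, same_b = _pair_counts(a, b)
    expected = same_a * same_b / pairs
    max_index = (same_a + same_b) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster in both
        return 1.0
    return (same_both - expected) / (max_index - expected)

# Cluster labels for four points under two different partitions.
labels_a = [0, 0, 1, 1]
labels_b = [0, 1, 0, 1]
ri = rand_index(labels_a, labels_b)
ari = adjusted_rand_index(labels_a, labels_b)
```

Both measures depend only on which points share a cluster, so renaming the cluster labels leaves the values unchanged.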