# Standardization and Its Effects on K-Means Clustering Algorithm

@article{Mohamad2013StandardizationAI, title={Standardization and Its Effects on K-Means Clustering Algorithm}, author={I. Mohamad and D. Usman}, journal={Research Journal of Applied Sciences, Engineering and Technology}, year={2013}, volume={6}, pages={3299-3303} }

Data clustering is an important data exploration technique with many applications in data mining. [...] By comparing the results on infectious diseases datasets, it was found that the result obtained by the z-score standardization method is more effective and efficient than min-max and decimal scaling standardization methods.
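As a rough illustration of the three standardization methods named in the abstract, the sketch below applies each one to a single feature column. The function names and sample values are hypothetical, and the use of the population standard deviation for the z-score is an assumption; the paper's exact formulas may differ.

```python
import statistics


def z_score(values):
    """Rescale a feature to zero mean and unit standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    if std == 0:
        raise ValueError("z-score is undefined for a constant feature")
    return [(v - mean) / std for v in values]


def min_max(values):
    """Rescale a feature linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max is undefined for a constant feature")
    return [(v - lo) / (hi - lo) for v in values]


def decimal_scaling(values):
    """Divide by the smallest power of 10 (exponent >= 0) that brings every |value| below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


# Hypothetical feature column, for illustration only.
feature = [200, 400, 600, 800, 1000]
print(z_score(feature))
print(min_max(feature))
print(decimal_scaling(feature))
```

Because k-means relies on distances between points, each feature would typically be standardized separately before clustering, so that a feature measured on a large scale does not dominate the distance computation.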

#### 201 Citations

Review on Optimal Data Analysis Based on New Projection-Based K-Means Initialization Clustering Algorithm

- Computer Science
- 2019

The proposed method first uses a standard Gaussian kernel density estimation technique to find the high-density data regions in one dimension, then iteratively applies density estimation from the lower-variance dimensions to the higher-variance ones until all the dimensions are computed.

K-MEANS CLUSTERING ALGORITHM USING INITIALIZATION AND NORMALIZATION METHODS

- 2018

Data clustering is the technique of clustering the data into different groups and these formed groups are known as Clusters in Data mining [1]. Data elements are clustered into different groups based…

Implementation of spectral clustering on microarray data of carcinoma using k-means algorithm

- Computer Science
- 2017

The major advantage of spectral clustering is that it reduces the data dimension, in this case the dimension of a large microarray dataset.

International Journal of Scientific Research in Computer Science, Engineering and Information Technology

- 2018

Data mining is the technique of finding useful information in raw data. As data grows day by day, it becomes difficult for us to analyze such a huge…

Application of the k-medoids Partitioning Algorithm for Clustering of Time Series Data

- Computer Science
- 2020 IEEE PES Innovative Smart Grid Technologies Europe (ISGT-Europe)
- 2020

A comprehensive analysis of the applicability of a standard clustering algorithm, the k-medoids algorithm, for clustering of two diverse time series datasets, on dynamic power responses of a hybrid renewable energy source plant and neuroscience spike-train data.

Comparative Performance Analysis of K-Means and DBSCAN Clustering algorithms on various platforms

- Computer Science
- 2019 22nd International Conference on Computer and Information Technology (ICCIT)
- 2019

In this study, the K-means and DBSCAN clustering algorithms are run on selected datasets on the four most popular platforms (Python, Matlab, R and Wolfram Mathematica), finding that the algorithms take different execution times on different platforms.

Classification Performance Improvement Using Random Subset Feature Selection Algorithm for Data Mining

- Computer Science
- Big Data Res.
- 2018

An attempt is made to improve the existing RSFS algorithm's performance for dimensionality reduction and to increase its stability; the improved algorithm is superior in reducing dimensionality and improving classification accuracy when used with a simple kNN classifier.

Application of Agglomerative Hierarchical Clustering for Clustering of Time Series Data

- Computer Science
- 2020 IEEE PES Innovative Smart Grid Technologies Europe (ISGT-Europe)
- 2020

Investigation of the performance of the standard agglomerative hierarchical clustering algorithm using two time series datasets from electric power system and neuroscience area shows that the effectiveness of the clustering algorithms is affected to a large extent by the main characteristics of the clusters data and algorithm’s parameters.

Avoiding common pitfalls when clustering biological data

- Computer Science, Medicine
- Science Signaling
- 2016

This article reviews common pitfalls identified from the published molecular biology literature and presents methods to avoid them, and discusses ensemble clustering as an easy-to-implement method that enables the exploration of multiple clustering solutions and improves robustness of clustering solutions.

A modified self-updating clustering algorithm for application to dengue gene expression data

- Computer Science, Mathematics
- Commun. Stat. Simul. Comput.
- 2021

It was demonstrated that the proposed approach does not require the number of clusters a priori, the convergence of the proposed algorithm was proved, and the algorithm was superior to the other compared algorithms.

#### References

Impact of Outlier Removal and Normalization Approach in Modified k-Means Clustering Algorithm

- Computer Science
- 2011

This paper analyzed the performance of a modified k-Means clustering algorithm with data preprocessing techniques, including a cleaning method, a normalization approach and outlier detection with automatic initialization of seed values, on datasets from the UCI dataset repository.

Data clustering: a review

- Computer Science
- CSUR
- 1999

An overview of pattern clustering methods from a statistical pattern recognition perspective is presented, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners.

A study of standardization of variables in cluster analysis

- Mathematics
- 1988

A methodological problem in applied clustering involves the decision of whether or not to standardize the input variables prior to the computation of a Euclidean distance dissimilarity measure.…

Data Mining: A Preprocessing Engine

- Computer Science
- 2006

This study emphasized different types of normalization, each of which was tested against the ID3 methodology using the HSV data set, and recommendations were made on the best normalization method based on the factors and their priorities.

Discovering Knowledge in Data: An Introduction to Data Mining

- Computer Science
- 2004

The second edition of a highly praised, successful reference on data mining, with thorough coverage of big data applications, predictive analytics, and statistical analysis. Includes new chapters on…

Data Mining: A Knowledge Discovery Approach

- Computer Science
- 2007

This comprehensive textbook on data mining details the unique steps of the knowledge discovery process that prescribes the sequence in which data mining projects should be performed, from problem and…

Feature normalization and likelihood-based similarity measures for image retrieval

- Mathematics, Computer Science
- Pattern Recognit. Lett.
- 2001

The effects of five feature normalization methods on retrieval performance are discussed and two likelihood ratio-based similarity measures that perform significantly better than the commonly used geometric approaches like the Lp metrics are described.

Impact of normalization in distributed K-means clustering

- Int. J. Soft Comput.
- 2009