Corpus ID: 52956330

Improvement of K Mean Clustering Algorithm Based on Density

@article{Chang2018ImprovementOK,
  title={Improvement of K Mean Clustering Algorithm Based on Density},
  author={Su Chang and Zhenzong Xu and Gao Xuan},
  journal={ArXiv},
  year={2018},
  volume={abs/1810.04559}
}
The purpose of this paper is to improve the traditional K-means algorithm. In the traditional K-means clustering algorithm, the initial cluster centers are generated randomly from the data set, which makes it easy to fall into a local minimum. The initial cluster centers selected by the density-based K-means clustering algorithm are more representative. The experimental results show that the improved K-means clustering algorithm can…
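The abstract does not state the exact density criterion, so the following is only a minimal sketch of the general idea, assuming a density-peaks-style rule: a point is a good initial center if it has many neighbours and lies far from any denser point. The function names, the `radius` cutoff, and the `rho * delta` score are illustrative assumptions, not the authors' implementation.

```python
import math


def density_based_centers(points, k, radius):
    """Pick k initial centers: dense points that lie far from denser points.

    Assumption: density is the number of neighbours closer than `radius`
    (a hypothetical cutoff); the paper's own definition may differ.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: neighbours strictly inside the cutoff, self excluded.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < radius)
           for i in range(n)]
    # Decreasing density; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    # delta: distance to the nearest point that comes earlier in that order.
    delta = [0.0] * n
    delta[order[0]] = max(dist[order[0]])
    for pos in range(1, n):
        i = order[pos]
        delta[i] = min(dist[i][j] for j in order[:pos])
    # Favour points that are both dense and isolated from denser points.
    best = sorted(range(n), key=lambda i: (-(rho[i] * delta[i]), i))[:k]
    return [points[i] for i in best]


def kmeans(points, centers, iterations=100):
    """Plain Lloyd iterations starting from the supplied centers."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(len(centers)),
                      key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # empty cluster: keep center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels
```

Calling `kmeans(points, density_based_centers(points, k, radius))` replaces the random seeding of the traditional algorithm with a deterministic one; the result still depends on the choice of `radius`, which has to be tuned to the scale of the data.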

ODBOT: Outlier detection-based oversampling technique for imbalanced datasets learning

TLDR
A simple but effective outlier detection-based oversampling technique (ODBOT) is proposed to handle the multi-class imbalance problem; it can reduce the risk of overlap among different class regions and build a better classification model.

Anomaly Detection in Hospital Claims Using K-Means and Linear Regression

TLDR
This research aims to detect anomalies in hospital claims to BPJS Kesehatan using a combination of the K-Means algorithm and linear regression, in the hope that identifying these anomalous points will allow more effective follow-up.

References

An efficient k'-means clustering algorithm

Clustering by fast search and find of density peaks

TLDR
A method in which the cluster centers are recognized as local density maxima that are far away from any points of higher density, and the algorithm depends only on the relative densities rather than their absolute values.

In search of optimal clusters using genetic algorithms

Data mining in soft computing framework: a survey

TLDR
A survey of the available literature on data mining using soft computing based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model is provided.

Some methods for classification and analysis of multivariate observations

The main purpose of this paper is to describe a process for partitioning an N-dimensional population into k sets on the basis of a sample. The process, which is called 'k-means,' appears to give

Towards multicriteria clustering: An extension of the k

Y, Vazirgiannis M. Cluster validity methods: part I [J]. SIGMOD Rec. 2002, 31(2): 40-45

Ensure that DN is the larger of the maximum values of the first two columns, and take it as the total number of data points