Parallel implementation of K-Means clustering algorithm based on MapReduce computing model of Hadoop

Hongbo Xu, Nianmin Yao, Qilong Han, Haiwei Pan
In recent years, data clustering has been studied extensively, and many methods and theories have been developed. However, with the growth of databases and the spread of the Internet, developments such as Big Data and Cloud Computing pose new challenges for research on data clustering. The paper presents a parallel k-means clustering algorithm based on the MapReduce computing model of the Hadoop platform. The MapReduce computing model has two phases: a map phase and a reduce phase. The map…
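The abstract breaks off before it says what the map and reduce phases compute, so the following is only a sketch of one common way to phrase k-means in map/reduce form, not the authors' implementation. It assumes the map phase assigns each point to its nearest centroid and the reduce phase averages the points assigned to each centroid. It is written in plain Python with no Hadoop API, and the names `map_phase`, `reduce_phase`, and `kmeans` are hypothetical.

```python
from collections import defaultdict
from math import dist


def map_phase(points, centroids):
    """Map (assumed design): emit (index of nearest centroid, point) per point."""
    for point in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
        yield nearest, point


def reduce_phase(pairs, centroids):
    """Reduce (assumed design): average the points grouped under each index."""
    groups = defaultdict(list)
    for index, point in pairs:
        groups[index].append(point)
    updated = list(centroids)  # a centroid with no points keeps its old position
    for index, members in groups.items():
        updated[index] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return updated


def kmeans(points, centroids, max_iterations=20):
    """Repeat map + reduce until the centroids stop moving or the limit is hit."""
    for _ in range(max_iterations):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids
```

On a real cluster the map calls would run in parallel over input splits and the grouping by centroid index would be done by the framework's shuffle step; here both are simulated in a single process to keep the sketch self-contained.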


