Parallel k means Clustering Algorithm on SMP

Authors: Athari M. Alrajhi and Soha S. Zaghloul
Journal: International Journal of New Computer Architectures and Their Applications
The k-means clustering algorithm is one of the simplest and most popular clustering algorithms. Due to its simplicity, it is widely used in many applications. Although k-means has low computational time and space complexity, its computational time grows in proportion to the dataset size. One of the most prominent solutions to this problem is parallel processing. In this paper, we aim to design and implement a parallel k-means clustering algorithm on shared…

Modeling and Clustering using Hypergraph

The purpose of this paper is to model text using a hypergraph and to apply the morphological operator to the hypergraph created from the underlying text to obtain text clusters.

Automated classification of behavioural and electrophysiological data in neuroscience

The results presented here show that machine learning algorithms and parallel processing architectures are both fundamental tools for coping with large and complex data sets, like the ones found in modern neuroscience.

IoT data processing pipeline in FoF perspective

Parallel K-means Clustering Algorithm on NOWs

This paper shows an improvement by a factor of O(K/2) obtained by applying theories of parallel computing to a parallel version of the K-means algorithm, which enables the algorithm to run on the larger collective memory of multiple machines when the memory of a single machine is insufficient to solve a problem.

Parallel K-Means Algorithm for Shared Memory Multiprocessors

The aim of this work is to provide a theoretical analysis of the performance of the k-means algorithm and to present extensive tests on a shared memory architecture.

Parallel k-Means++ for Multiple Shared-Memory Architectures

This paper presents a parallelization of the exact k-means++ algorithm, with a proof of its correctness, and develops implementations for three distinct shared-memory architectures: multicore CPU, high performance GPU, and the massively multithreaded Cray XMT platform.

A Parallel Implementation of K-Means Clustering on GPUs

This paper introduces a first step towards building an efficient GPU-based parallel implementation of a commonly used clustering algorithm called K-Means on an NVIDIA G80 PCI express graphics board using the CUDA processing extensions.

Parallel Computing: Performance Metrics and Models

The minimum requirements that a model for parallel computers should meet before it can be considered acceptable are laid out, the BSP and LogP models are considered, and the importance of the specifics of the interconnect topology in developing good parallel algorithms is pointed out.

Parallel K-Means Algorithm on Agricultural Databases

This work implements a parallel k-means algorithm for clustering large datasets, using agricultural datasets because limited research has been done in the agricultural field.

Initial Starting Point Analysis for K-Means Clustering: A Case Study

It is shown that the results of running the k-means algorithm on the same workload will vary depending on the values chosen as initial starting points.

BIRCH: A New Data Clustering Algorithm and Its Applications

An efficient and scalable data clustering method is proposed, based on a new in-memory data structure called CF-tree, which serves as an in-memory summary of the data distribution, and implemented in a system called BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), and compared with other available methods.

Big CPU, Big Data: Solving the World's Toughest Computational Problems with Parallel Computing

In the twenty-first century, scientists and engineers are tackling the world's toughest computational problems with parallel computing. Using multiple processor cores running simultaneously, parallel…

Data clustering: 50 years beyond K-means

A brief overview of clustering is provided, well known clustering methods are summarized, the major challenges and key issues in designing clustering algorithms are discussed, and some of the emerging and useful research directions are pointed out.