Exploiting parallelism to support scalable hierarchical clustering

Rebecca Cathey, Eric C. Jensen, Steven M. Beitzel, Ophir Frieder, David A. Grossman
A distributed-memory parallel version of the group-average Hierarchical Agglomerative Clustering algorithm is proposed to enable scaling the document clustering problem to large collections. Using standard message-passing operations reduces interprocess communication while maintaining efficient load balancing. In a series of experiments using a subset of a standard TREC test collection, our parallel hierarchical clustering algorithm is shown to be scalable in terms of processors efficiently…
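For orientation, the sequential group-average step that the abstract builds on can be sketched as follows. This is a minimal illustration and not the authors' parallel algorithm: it assumes a precomputed symmetric document similarity matrix, takes "group average" to mean the mean similarity over between-cluster document pairs (the paper's exact definition may differ), and uses a hypothetical function name, `group_average_hac`. The distribution of the matrix across processes and the message-passing steps are not shown.

```python
def group_average_hac(sim, num_clusters=1):
    """Sequential group-average agglomerative clustering (illustrative sketch).

    sim: symmetric matrix (list of lists) of pairwise document similarities.
    Returns (merges, clusters): merges is a list of (members, similarity)
    in merge order; clusters holds the member lists that remain.
    """
    n = len(sim)
    # Active clusters: cluster id -> list of document indices.
    members = {i: [i] for i in range(n)}
    # Cluster-to-cluster similarities, keyed by (smaller id, larger id).
    s = {(i, j): float(sim[i][j]) for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n  # ids for merged clusters are always larger than existing ids

    while len(members) > num_clusters:
        # Pick the most similar pair of active clusters.
        (a, b), best = max(s.items(), key=lambda kv: kv[1])
        na, nb = len(members[a]), len(members[b])
        merged = members.pop(a) + members.pop(b)
        del s[(a, b)]
        # Size-weighted average keeps the between-cluster mean exact.
        for k in members:
            sa = s.pop((min(a, k), max(a, k)))
            sb = s.pop((min(b, k), max(b, k)))
            s[(k, next_id)] = (na * sa + nb * sb) / (na + nb)
        members[next_id] = merged
        merges.append((sorted(merged), best))
        next_id += 1

    return merges, [sorted(m) for m in members.values()]


# Toy example: documents 0 and 1 are similar, as are documents 2 and 3.
toy_sim = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.3, 0.2],
    [0.1, 0.3, 1.0, 0.8],
    [0.2, 0.2, 0.8, 1.0],
]
toy_merges, toy_clusters = group_average_hac(toy_sim, num_clusters=2)
```

The search for the most similar pair and the similarity update after each merge dominate the cost of this loop, so they are the natural candidates for splitting across processors.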


