Exploiting parallelism to support scalable hierarchical clustering

@article{Cathey2007ExploitingPT,
  title={Exploiting parallelism to support scalable hierarchical clustering},
  author={Rebecca Cathey and Eric C. Jensen and Steven M. Beitzel and Ophir Frieder and David A. Grossman},
  journal={JASIST},
  year={2007},
  volume={58},
  pages={1207--1221}
}
A distributed memory parallel version of the group average Hierarchical Agglomerative Clustering algorithm is proposed to enable scaling the document clustering problem to large collections. Using standard message passing operations reduces interprocess communication while maintaining efficient load balancing. In a series of experiments using a subset of a standard TREC test collection, our parallel hierarchical clustering algorithm is shown to be scalable in terms of processors efficiently…

