Efficient phrase-based document indexing for Web document clustering

@article{Hammouda2004EfficientPD,
  title={Efficient phrase-based document indexing for Web document clustering},
  author={Khaled M. Hammouda and Mohamed S. Kamel},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  year={2004},
  volume={16},
  pages={1279--1296}
}
Document clustering techniques mostly rely on single-term analysis of the document data set, such as the vector space model. To achieve more accurate document clustering, more informative features, including phrases and their weights, are particularly important in such scenarios. Document clustering is particularly useful in many applications such as automatic categorization of documents, grouping search engine results, building a taxonomy of documents, and others. This article presents two key…



Semantic Scholar estimates that this publication has 336 citations based on the available data.
