Learning a concept-based document similarity measure

Abstract

Document similarity measures are crucial components of many text-analysis tasks, including information retrieval, document classification, and document clustering. Conventional measures are brittle: They estimate the surface overlap between documents based on the words they mention and ignore deeper semantic connections. We propose a new measure that…
DOI: 10.1002/asi.22689
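
The brittleness of word-overlap measures described in the abstract can be illustrated with a minimal sketch. It implements a plain bag-of-words cosine similarity, which is a common conventional baseline and not the measure proposed in the paper. The function names and the three example sentences are assumptions made up for this illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors."""
    # Counter returns 0 for terms it has not seen, so only shared terms contribute.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical example sentences (not taken from the paper).
doc_a = "Physicians recommend regular exercise"
doc_b = "Doctors advise frequent workouts"        # same meaning, different words
doc_c = "Physicians recommend regular checkups"   # different topic, shared words

sim_ab = cosine_similarity(bag_of_words(doc_a), bag_of_words(doc_b))
sim_ac = cosine_similarity(bag_of_words(doc_a), bag_of_words(doc_c))
print(sim_ab)  # 0.0: no shared surface words despite near-identical meaning
print(sim_ac)  # 0.75: three of four words shared
```

The paraphrase pair scores lower than the pair that merely shares vocabulary, which is the kind of failure a concept-based measure is meant to address.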

14 Figures and Tables

Statistics

[Chart: Citations per Year. Vertical axis ticks at 0, 10, 20; horizontal axis covers 2015, 2016, 2017.]

Citation Velocity: 11

Averaging 11 citations per year over the last 3 years.

Cite this paper

@article{Huang2012LearningAC,
  title={Learning a concept-based document similarity measure},
  author={Anna-Lan Huang and David N. Milne and Eibe Frank and Ian H. Witten},
  journal={JASIST},
  year={2012},
  volume={63},
  pages={1593-1608}
}