Annotating Documents by Wikipedia Concepts

@article{Schnhofen2008AnnotatingDB,
  title={Annotating Documents by Wikipedia Concepts},
  author={P{\'e}ter Sch{\"o}nhofen},
  journal={2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology},
  year={2008},
  volume={1},
  pages={461--467}
}
We present a technique that reliably labels words or phrases of an arbitrary document with the Wikipedia articles (concepts) best describing their meaning. First it scans the document content, and when it finds a word sequence matching the title of a Wikipedia article, it attaches the article to the constituent word(s). The collected articles are then scored based on three factors: (1) how many other detected articles they semantically relate to, according to the Wikipedia link structure…
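
The two steps the abstract spells out, title matching and the first scoring factor, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the `max_len` cap on phrase length, the lowercase tokenisation, and the toy `links` dictionary standing in for the Wikipedia link structure are all hypothetical. The remaining scoring factors are truncated in the abstract and are not reproduced here.

```python
import re


def find_title_matches(text, titles, max_len=4):
    """Return (start, end, title) spans whose tokens equal a known article title.

    Assumption: titles and text are compared as lowercase word tokens.
    """
    tokens = re.findall(r"\w+", text.lower())
    # Map each title's token tuple back to the original title string.
    index = {tuple(re.findall(r"\w+", t.lower())): t for t in titles}
    matches = []
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(tokens):
                break
            title = index.get(tuple(tokens[start:end]))
            if title is not None:
                matches.append((start, end, title))
    return matches


def relatedness_scores(matches, links):
    """Score each detected article by how many other detected articles it links to or from.

    Assumption: `links` maps an article title to the set of titles it links to.
    """
    detected = {title for _, _, title in matches}
    scores = {}
    for title in detected:
        outgoing = links.get(title, set())
        scores[title] = sum(
            1
            for other in detected
            if other != title
            and (other in outgoing or title in links.get(other, set()))
        )
    return scores


# Toy data, invented for illustration only.
titles = ["Jaguar", "Jaguar Cars", "Big cat", "Amazon rainforest"]
links = {
    "Jaguar": {"Big cat", "Amazon rainforest"},
    "Big cat": {"Jaguar"},
    "Jaguar Cars": {"Automobile"},
}
text = "The jaguar is a big cat found in the Amazon rainforest."
matches = find_title_matches(text, titles)
scores = relatedness_scores(matches, links)
```

In this toy run, "Jaguar" is linked with both other detected articles and so scores highest, while "Jaguar Cars" is never detected because the word sequence does not occur in the text.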
