Automatic Acquisition of Named Entity Tagged Corpus from World Wide Web

@inproceedings{An2003AutomaticAO,
  title={Automatic Acquisition of Named Entity Tagged Corpus from World Wide Web},
  author={Joohui An and Seungwoo Lee and Gary Geunbae Lee},
  booktitle={ACL},
  year={2003}
}
In this paper, we present a method that automatically constructs a Named Entity (NE) tagged corpus from the web for training Named Entity Recognition systems. We use an NE list and a web search engine to collect web documents that contain the NE instances. The documents are refined through sentence separation and text refinement procedures, and the NE instances are finally tagged with the appropriate NE categories. Our experiments demonstrate that the suggested method can acquire…
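The sentence separation and tagging steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the web search step is omitted (the documents are assumed to be already collected as plain strings), the sentence splitter is a naive regular expression, and the function names, the `<CATEGORY>` tag format, and the example categories are all hypothetical.

```python
import re


def split_sentences(text):
    """Naive sentence separation: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tag_sentence(sentence, ne_list):
    """Wrap each NE instance in <CATEGORY>...</CATEGORY> tags.

    ne_list maps an entity string to its category (hypothetical format).
    Longer names are tried first so a multi-word name wins over a shorter
    name that it contains.
    """
    names = sorted(ne_list, key=len, reverse=True)
    if not names:
        return sentence
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")

    def wrap(match):
        name = match.group(1)
        category = ne_list[name]
        return f"<{category}>{name}</{category}>"

    # A single substitution pass avoids re-tagging text inside inserted tags.
    return pattern.sub(wrap, sentence)


def build_corpus(documents, ne_list):
    """Keep only the sentences that contain at least one NE instance."""
    corpus = []
    for doc in documents:
        for sentence in split_sentences(doc):
            tagged = tag_sentence(sentence, ne_list)
            if tagged != sentence:
                corpus.append(tagged)
    return corpus


if __name__ == "__main__":
    ne_list = {"Seoul": "LOCATION", "Gary Lee": "PERSON"}
    documents = ["Gary Lee lives in Seoul. He likes tea."]
    for line in build_corpus(documents, ne_list):
        print(line)
```

Exact string matching against the NE list is the simplest choice here; it cannot resolve ambiguous names, which is one reason a refinement step like the one the abstract mentions would be needed in practice.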
This paper has 40 citations.