Automatic Acquisition of Named Entity Tagged Corpus from World Wide Web

Joohui An, Seungwoo Lee, Gary Geunbae Lee
In this paper, we present a method that automatically constructs a Named Entity (NE) tagged corpus from the web for training Named Entity Recognition systems. We use an NE list and a web search engine to collect web documents that contain the NE instances. The documents are refined through sentence separation and text refinement procedures, and the NE instances are finally tagged with the appropriate NE categories. Our experiments demonstrate that the suggested method can acquire…
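
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the NE list entries, the function names (`split_sentences`, `tag_sentence`, `build_corpus`), the inline tag format, and the naive regex-based sentence splitter are all assumptions made for illustration. The web search step is replaced by a plain list of document strings, and the text refinement step is omitted because the abstract does not specify it.

```python
import re

# Hypothetical NE list: surface form -> NE category (illustrative entries only).
NE_LIST = {
    "Seoul": "LOCATION",
    "Gary Lee": "PERSON",
}

def split_sentences(text):
    """Naive sentence separation: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tag_sentence(sentence, ne_list):
    """Wrap every NE-list match in <CATEGORY>...</CATEGORY> tags (assumed format)."""
    # Try longer names first so they win over shorter entries they contain.
    names = sorted(ne_list, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")

    def wrap(match):
        name = match.group(1)
        category = ne_list[name]
        return f"<{category}>{name}</{category}>"

    return pattern.sub(wrap, sentence)

def build_corpus(documents, ne_list):
    """Keep only sentences that contain at least one NE instance, tagged."""
    corpus = []
    for doc in documents:  # stand-in for documents returned by a search engine
        for sentence in split_sentences(doc):
            tagged = tag_sentence(sentence, ne_list)
            if tagged != sentence:
                corpus.append(tagged)
    return corpus
```

Calling `build_corpus(["Gary Lee lives in Seoul. It rains."], NE_LIST)` keeps only the first sentence, with both entities tagged. Exact string matching like this cannot resolve ambiguous names, so a real system would need some additional filtering.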
This paper has 58 citations.


Publications citing this paper.

Boosted Web Named Entity Recognition via Tri-Training

ACM Trans. Asian & Low-Resource Lang. Inf. Process. • 2016
