Automatic Creation of Arabic Named Entity Annotated Corpus Using Wikipedia

@inproceedings{Althobaiti2014AutomaticCO,
  title={Automatic Creation of Arabic Named Entity Annotated Corpus Using Wikipedia},
  author={Maha Althobaiti and Udo Kruschwitz and Massimo Poesio},
  booktitle={EACL},
  year={2014}
}
In this paper we propose a new methodology that exploits Wikipedia's features and structure to automatically develop an Arabic named entity (NE) annotated corpus. Each Wikipedia link is replaced by the NE type of its target article in order to produce the NE annotation. Other Wikipedia features, namely redirects, anchor texts, and inter-language links, are used to tag additional NEs that appear without links in Wikipedia texts. Furthermore, we have developed a filtering algorithm to eliminate ambiguity when…
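The link-to-annotation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `ARTICLE_TYPES` and `REDIRECTS` tables, the `annotate` function, and the English sample text are all hypothetical stand-ins, the article-type mapping is assumed to be given, and inter-language links and the filtering algorithm are not modeled. Unlinked mentions are found by plain substring matching, which ignores token boundaries.

```python
import re

# Hypothetical article-title -> NE-type table. How article types are
# obtained is not covered by this sketch; the mapping is assumed as input.
ARTICLE_TYPES = {"Cairo": "LOC", "Naguib Mahfouz": "PER"}

# Hypothetical redirect table: alternative title -> canonical article title.
REDIRECTS = {"Najib Mahfuz": "Naguib Mahfouz"}

# Matches wikitext links of the form [[Target]] or [[Target|anchor text]].
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def annotate(wikitext):
    """Return (plain_text, spans); each span is (start, end, ne_type)."""
    # Alias table: surface form -> NE type, seeded from titles and redirects.
    aliases = dict(ARTICLE_TYPES)
    for alt, target in REDIRECTS.items():
        if target in ARTICLE_TYPES:
            aliases[alt] = ARTICLE_TYPES[target]

    parts, spans, pos, out_len = [], [], 0, 0
    for m in LINK.finditer(wikitext):
        before = wikitext[pos:m.start()]
        parts.append(before)
        out_len += len(before)

        target = m.group(1).strip()
        anchor = m.group(2) or m.group(1)
        target = REDIRECTS.get(target, target)  # resolve redirects
        ne_type = ARTICLE_TYPES.get(target)
        if ne_type:
            # The link becomes an annotation with the target article's type.
            spans.append((out_len, out_len + len(anchor), ne_type))
            # Remember the anchor text as another surface form of this NE.
            aliases.setdefault(anchor, ne_type)

        parts.append(anchor)
        out_len += len(anchor)
        pos = m.end()
    parts.append(wikitext[pos:])
    text = "".join(parts)

    # Second pass: tag unlinked mentions of known aliases, longest first,
    # skipping any match that overlaps an existing annotation.
    for alias, ne_type in sorted(aliases.items(), key=lambda kv: -len(kv[0])):
        for m in re.finditer(re.escape(alias), text):
            s, e = m.span()
            if all(e <= cs or s >= ce for cs, ce, _ in spans):
                spans.append((s, e, ne_type))
    return text, sorted(spans)


if __name__ == "__main__":
    sample = ("[[Naguib Mahfouz|Mahfouz]] was born in [[Cairo]]. "
              "Mahfouz left Cairo rarely.")
    text, spans = annotate(sample)
    for start, end, ne_type in spans:
        print(text[start:end], ne_type)
```

In the sample, the two linked mentions are typed from their target articles, and the later unlinked "Mahfouz" and "Cairo" are tagged because the first pass recorded the anchor text and article title as aliases.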