Word Segmentation of Informal Arabic with Domain Adaptation

Will Monroe, Spence Green, and Christopher D. Manning

Segmentation of clitics has been shown to improve accuracy on a variety of Arabic NLP tasks. However, state-of-the-art Arabic word segmenters either are limited to formal Modern Standard Arabic (MSA), performing poorly on Arabic text featuring dialectal vocabulary and grammar, or rely on linguistic knowledge that is hand-tuned for each dialect. We extend an existing MSA segmenter with a simple domain adaptation technique and new features in order to segment informal and dialectal Arabic text…
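
The abstract does not name the domain adaptation technique it uses. Purely as an illustration, one well-known simple approach is feature-space augmentation (Daumé III, 2007), in which every feature is copied into a shared version and a domain-specific version so that a single model can learn both general and per-domain weights. The sketch below assumes that technique; the function name, the feature names, and the domain labels ("msa", "egyptian") are hypothetical and do not come from the paper.

```python
def augment_features(features, domain):
    """Return a copy of `features` with a shared and a domain-specific
    version of every feature (feature-space augmentation).

    `features` maps feature names to values; `domain` is a short label
    such as "msa" or "egyptian" (both labels are assumptions).
    """
    augmented = {}
    for name, value in features.items():
        augmented["shared:" + name] = value      # fires in every domain
        augmented[domain + ":" + name] = value   # fires only in this domain
    return augmented


# Toy character-level features for one position in a word (illustrative only).
example = augment_features({"char=w": 1.0, "prev=<s>": 1.0}, "egyptian")
print(sorted(example))
```

With this encoding, evidence that holds across domains can be absorbed by the shared weights, while dialect-specific behavior is captured by the domain-specific copies.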
This paper has 47 citations.

Key Quantitative Results

  • Experiments show that our system outperforms existing systems on broadcast news and Egyptian dialect, improving segmentation F1 score on a recently released Egyptian Arabic corpus to 92.09%, compared to 91.60% for another segmenter designed specifically for Egyptian Arabic.
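
The excerpt reports segmentation F1 but does not define how it is computed. A minimal sketch under one common convention: each word's segments are converted to character spans, and precision and recall are micro-averaged over spans that match exactly. The function names and the toy transliterated words are assumptions for illustration, not data or code from the paper.

```python
def segment_spans(segments):
    """Convert a list of segment strings into a set of (start, end) spans."""
    spans = set()
    start = 0
    for seg in segments:
        end = start + len(seg)
        spans.add((start, end))
        start = end
    return spans


def segmentation_f1(gold_words, pred_words):
    """Micro-averaged F1 over exactly matching segment spans.

    Each argument is a list of words; each word is a list of segments.
    The two lists are assumed to be aligned word by word.
    """
    true_pos = n_gold = n_pred = 0
    for gold, pred in zip(gold_words, pred_words):
        g, p = segment_spans(gold), segment_spans(pred)
        true_pos += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example: the prediction misses one clitic boundary in the first word.
gold = [["w", "ktb", "hA"], ["ktAb"]]
pred = [["w", "ktbhA"], ["ktAb"]]
print(round(segmentation_f1(gold, pred), 4))
```

Here two of three predicted segments and two of four gold segments match, so precision is 2/3, recall is 1/2, and F1 is 4/7.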

