Feature-Based Tagger of Approximations of Functional Arabic Morphology

Abstract

The field of morphological disambiguation of Arabic has recently witnessed significant achievements (Habash and Rambow [15], Smith et al. [28]). Through this work, the Penn Arabic Treebank (PATB, Maamouri et al. [24]) is being confirmed as a standard for the development and evaluation of systems for automatic morphological processing of Arabic, and the Buckwalter Arabic Morphological Analyzer (Buckwalter [6, 7]) is becoming the most respected lexical resource of its kind. The context for understanding the current paper has evolved since our work on it started, yet its motivation is unchanged and its conclusions remain valid and up to date. We would like to open some issues concerning the very description of Arabic morphology and point out that in this domain, one should carefully distinguish individual problems, theories, resources, and solutions, given their frequent idiosyncrasies and incompatibilities. In this contribution, we reference Functional Arabic Morphology (Smrž [29]) and take the Buckwalter Morphology as the point of departure for approximating this novel model by (a) restoring the true syntactic units and (b) seeking their functional, rather than structural, morphological categories. We then present five versions of a feature-based morphological tagger depending on that approximation, built on all the currently available parts of PATB as well as on the MorphoTrees annotations of the Prague Arabic Dependency Treebank (PADT, Hajič et al. [18]).

Cite this paper

@inproceedings{Hajic2005FeatureBasedTO, title={Feature-Based Tagger of Approximations of Functional Arabic Morphology}, author={Jan Hajic and Otakar Smrz and Tim Buckwalter and Hubert Jin}, year={2005} }