A Lexical Database for Modern Standard Arabic Interoperable with a Finite State Morphological Transducer

@inproceedings{Attia2011ALD,
  title={A Lexical Database for Modern Standard Arabic Interoperable with a Finite State Morphological Transducer},
  author={Mohammed Attia and Pavel Pecina and Antonio Toral and Lamia Tounsi and Josef van Genabith},
  booktitle={SFCM},
  year={2011}
}
Current Arabic lexicons, whether computational or otherwise, make no distinction between entries from Modern Standard Arabic (MSA) and Classical Arabic (CA), and tend to include obsolete words that are not attested in current usage. We address this problem by building a large-scale, corpus-based lexical database that is representative of MSA. We use an MSA corpus of 1,089,111,204 words, a pre-annotation tool, machine learning techniques, and knowledge-based templatic matching to automatically…

References

Buckwalter Arabic Morphological Analyzer (BAMA) Version 2.0. Linguistic Data Consortium (LDC) catalogue

T. Buckwalter
2004

Multilingual resources for NLP in the lexical markup framework (LMF)

Language Resources and Evaluation • 2009

Dictionary of Modern Written Arabic, pp. VII-XV

H. Wehr, J. M. Cowan
Spoken Language Services, Ithaca • 1976

Foma: a Finite-State Compiler and Library

EACL • 2009

Finite State Morphology: CSLI studies in computational linguistics

K. R. Beesley, L. Karttunen
CSLI, Stanford • 2003
