Machine Translation of Arabic Dialects

@inproceedings{Zbib2012MachineTO,
  title={Machine Translation of Arabic Dialects},
  author={Rabih Zbib and Erika Malchiodi and Jacob Devlin and David Stallard and Spyridon Matsoukas and Richard M. Schwartz and John Makhoul and Omar Zaidan and Chris Callison-Burch},
  booktitle={HLT-NAACL},
  year={2012}
}
Arabic Dialects present many challenges for machine translation, not least of which is the lack of data resources. We use crowdsourcing to cheaply and quickly build Levantine-English and Egyptian-English parallel corpora, consisting of 1.1M words and 380k words, respectively. The dialectal sentences are selected from a large corpus of Arabic web text, and translated using Amazon’s Mechanical Turk. We use this data to build Dialectal Arabic MT systems, and find that small amounts of dialectal…
Highly Influential
This paper has highly influenced 10 other papers.
Highly Cited
This paper has 128 citations.

From This Paper

Key Quantitative Results

  • When translating Egyptian and Levantine test sets, our Dialectal Arabic MT system performs 6.3 and 7.0 BLEU points higher than a Modern Standard Arabic MT system trained on a 150M-word Arabic-English parallel corpus.

Citations

Publications citing this paper.
Showing 1-10 of 93 extracted citations

Evaluating English to Arabic machine translators

2013 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT) • 2013
Highly Influenced

128 Citations

[Figure: Citations per Year, 2013 to 2019]
Semantic Scholar estimates that this publication has 128 citations based on the available data.
