A complete KALDI recipe for building Arabic speech recognition systems

Authors: Ahmed M. Ali, Yifan Zhang, Patrick Cardinal, Najim Dehak, Stephan Vogel, James R. Glass
Published in: 2014 IEEE Spoken Language Technology Workshop (SLT)
In this paper we present a recipe and language resources for training and testing Arabic speech recognition systems using the KALDI toolkit. We built a prototype broadcast news system using 200 hours of GALE data that is publicly available through LDC. We describe in detail the decisions made in building the system: using the MADA toolkit for text normalization and vowelization; why we use 36 phonemes; how we generate pronunciations; how we build the language model. We report results using state…
This paper has 38 citations.

Citation Velocity: 10

Averaging 10 citations per year over the last 3 years.