Automatically generated, phonemic Arabic-IPA pronunciation tiers for the boundary annotated Qur’an dataset for machine learning (version 2.0)

@inproceedings{Sawalha2014AutomaticallyGP,
  title={Automatically generated, phonemic Arabic-IPA pronunciation tiers for the boundary annotated Qur’an dataset for machine learning (version 2.0)},
  author={Majdi Sawalha and Claire Brierley and Eric Atwell},
  year={2014}
}
In this paper, we augment the Boundary Annotated Qur’an dataset published at LREC 2012 (Brierley et al. 2012; Sawalha et al. 2012a) with automatically generated phonemic transcriptions of Arabic words. We have developed and evaluated a comprehensive grapheme-phoneme mapping from Standard Arabic to IPA (Brierley et al., under review), and implemented the mapping in Arabic transcription technology that achieves 100% accuracy as measured against two gold standards: one for Qur’anic or Classical Arabic…