We report on the creation of a database composed of images of printed Arabic words. The purpose of this database is the large-scale benchmarking of open-vocabulary, multi-font, multi-size and multi-style text recognition systems for Arabic. The challenges addressed by the database lie in the variability of the sizes, fonts and styles used to generate…
In this paper, we propose a new font and size identification method for ultra-low resolution Arabic word images using a stochastic approach. The literature shows that Arabic text recognition systems have difficulty handling multi-font and multi-size word images. This is due to the variability induced by some font families, in addition to the inherent…
This first competition used the freely available Arabic Printed Text Image (APTI) database. Several research groups have started using the APTI database, and this year two groups with three systems are participating in the competition. The systems are compared using their recognition rates at the character and word levels. The systems were tested on one test…
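The abstract above does not say how the character-level and word-level recognition rates are computed. As a minimal sketch, assuming the character rate is one minus the total edit distance divided by the total number of reference characters, and the word rate is the fraction of exactly matching words, the two figures could be obtained as follows. The function names `edit_distance` and `recognition_rates` are hypothetical and are not taken from the competition protocol.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def recognition_rates(references, hypotheses):
    """Return (character_rate, word_rate) for paired word transcriptions.

    Assumed definitions (not confirmed by the source):
      character_rate = 1 - total_edit_distance / total_reference_characters
      word_rate      = exactly_matching_words / number_of_words
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    char_errors = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    words_correct = sum(r == h for r, h in zip(references, hypotheses))
    return 1 - char_errors / total_chars, words_correct / len(references)


if __name__ == "__main__":
    # Toy Latin-script data, used only to keep the example easy to read.
    char_rate, word_rate = recognition_rates(["abc", "de"], ["abc", "dx"])
    print(char_rate, word_rate)
```

Under this definition the character rate can become negative when a system inserts many spurious characters; some evaluations clip it at zero, which is a further design choice left open here.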
In this paper, we analyze the impact of the choice of sub-models on automatic Arabic printed text recognition based on Hidden Markov Models (HMMs). In our approach, sub-models correspond to character shapes, which are assembled to compose word models. One peculiarity of Arabic writing is that a character takes various shapes according to its position in the word.…
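To illustrate position-dependent character shapes, the sketch below labels each letter of a word as isolated, initial, medial or final, and builds a word model as a sequence of (character, shape) sub-model names. It is only an illustration under stated assumptions, not the authors' sub-model inventory. The set of letters that never connect to the following letter is a simplified assumption, and diacritics and ligatures such as lam-alef are ignored. The names `positional_shapes` and `word_model` are hypothetical.

```python
# Letters that join only to the preceding letter (simplified assumption):
# alef and its hamza/madda variants, dal, thal, reh, zain, waw,
# waw with hamza, teh marbuta, alef maksura.
RIGHT_JOINING_ONLY = set(
    "\u0627\u0623\u0625\u0622\u062F\u0630\u0631\u0632\u0648\u0624\u0629\u0649"
)
# Standalone hamza joins to neither neighbour.
NON_JOINING = {"\u0621"}


def joins_forward(ch):
    """True if the letter can connect to the letter that follows it."""
    return ch not in RIGHT_JOINING_ONLY and ch not in NON_JOINING


def positional_shapes(word):
    """Return one shape label per letter of an undiacritized Arabic word."""
    shapes = []
    for i, ch in enumerate(word):
        joins_prev = (i > 0 and joins_forward(word[i - 1])
                      and ch not in NON_JOINING)
        joins_next = (i + 1 < len(word) and joins_forward(ch)
                      and word[i + 1] not in NON_JOINING)
        if joins_prev and joins_next:
            shapes.append("medial")
        elif joins_prev:
            shapes.append("final")
        elif joins_next:
            shapes.append("initial")
        else:
            shapes.append("isolated")
    return shapes


def word_model(word):
    """Word model as a list of (character, shape) sub-model identifiers."""
    return list(zip(word, positional_shapes(word)))


if __name__ == "__main__":
    kitab = "\u0643\u062A\u0627\u0628"  # kaf, teh, alef, beh
    for ch, shape in word_model(kitab):
        print(f"U+{ord(ch):04X} {shape}")
```

In an HMM-based recognizer, each (character, shape) pair would typically map to its own character model, and the word model would be the concatenation of those models in reading order.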
Standard databases play an essential role in evaluating and comparing results obtained by different groups of researchers. In this paper, an Arabic Handwritten Text Images Database written by Multiple Writers (AHTID/MW) is introduced. This database can be used for research in the recognition of Arabic handwritten text with open vocabulary, word segmentation…
In this paper, we propose a new linguistic approach, called the affixal approach, for Arabic word and text image recognition. Most existing work in the field integrates knowledge of the Arabic language into the recognition process in one of two ways: either in post-recognition, using a language dictionary (a dictionary of words) to validate the word…