Sheikh Faisal Rashid

Learn More
—Recurrent neural networks (RNN) have been successfully applied for recognition of cursive handwritten documents , both in English and Arabic scripts. Ability of RNNs to model context in sequence data like speech and text makes them a suitable candidate to develop OCR systems for printed Nabataean scripts (including Nastaleeq for which no OCR system is(More)
—Optical character recognition (OCR) of machine printed Latin script documents is ubiquitously claimed as a solved problem. However, error free OCR of degraded or noisy text is still challenging for modern OCR systems. Most recent approaches perform segmentation based character recognition. This is tricky because segmentation of degraded text is itself(More)
—Segmentation and recognition of screen rendered text is a challenging task due to its low resolution (72 or 96 ppi) and use of anti-aliased rendering. This paper evaluates Hidden Markov Model (HMM) techniques for OCR of low resolution text–both on screen rendered isolated characters and screen rendered text-lines–and compares it with the performance of(More)
—Cursive handwriting recognition is still a hot topic of research, especially for non-Latin scripts. One of the techniques which yields best recognition results is based on recurrent neural networks: with neurons modeled by long short-term memory (LSTM) cells, and alignment of label sequence to output sequence performed by a connectionist temporal(More)
OCR of multi-font Arabic text is difficult due to large variations in character shapes from one font to another. It becomes even more challenging if the text is rendered at very low resolution. This paper describes a multi-font, low resolution, and open vocabulary OCR system based on a multidimensional recurrent neural network architecture. For this work,(More)
A large amount of real-world data is required to train and benchmark any character recognition algorithm. Developing a page-level ground-truth database for this purpose is overwhelmingly laborious, as it involves a lot of manual efforts to produce a reasonable database that covers all possible words of a language. Moreover, generating such a database for(More)
Document script recognition is one of the important pre-processing steps in a multilingual optical character recognition (MOCR) system. A MOCR system requires prior knowledge of script to accurately recognize multilingual text in a single document. In multilingual documents two scripts can be mixed together within a single text line. Many existing script(More)
In current study we examine how letter permutation affects in visual recognition of words for two orthographically dissimilar languages, Urdu * and German. We present the hypothesis that recognition or reading of permuted and non-permuted words are two distinct mental level processes, and that people use different strategies in handling permuted words as(More)