Learn More
Discriminating handwritten and printed text is a challenging task in an arbitrary orientation scenario. The task gets even tougher when the text content is by nature sparse in the document, e.g. in torn document pieces. We here propose a system for discriminating handwritten and printed text in the context of sparse data and arbitrary orientation. A(More)
In some Thai documents, a single text line of a printed document page may contain words of both Thai and Roman scripts. For the Optical Character Recognition (OCR) of such a document page it is better to identify, at first, Thai and Roman script portions and then to use individual OCR systems of the respective scripts on these identified portions. In this(More)
—Script identification is an essential step for the efficient use of the appropriate OCR in multilingual document images. There are various techniques available for script identification from printed and handwritten document images, but script identification from video frames has not been explored much. This paper presents a study of some pre-processing(More)
A two-stage approach for word-wise identification of English (Roman), Devnagari and Bengali (Bangla) scripts is proposed. This approach balances the tradeoff between recognition accuracy and processing speed. The 1st stage allows identifying scripts with high speed, yet less accuracy when dealing with noisy data. The advanced 2nd stage processes only those(More)
Automatic identification of an individual based on his/her handwriting characteristics is an important forensic tool. In a computational forensic scenario, presence of huge amount of text/information in a questioned document cannot be always ensured. Also, compromising in terms of systems reliability under such situation is not desirable. We here propose a(More)
There are many documents in Srilanka where a single document page may contain Sinhala, Tamil and English texts. For OCR development of such a document page, it is better to identify different scripts present in the page and then feed the identified portion to the respective OCR module. In this paper, a SVM based technique is proposed for word-wise(More)