Chun Lei He

We propose a novel word spotting system for Urdu words within handwritten text lines. Spatial information about diacritics is integrated into the detection of the main connected components during candidate word generation. An Urdu word recognition system is designed and applied to classify the candidate words. In this word recognition system, compound …
This paper presents a Linear Discriminant Analysis-based Measurement (LDAM), computed on classifier outputs, as a criterion for rejecting patterns that cannot be classified with high reliability. This is important in applications, such as the processing of financial documents, where errors can be very costly and are therefore less tolerable than rejections. To …
A new large Urdu handwriting database was designed at the Centre for Pattern Recognition and Machine Intelligence (CENPARMI). It includes isolated digits, numeral strings with and without decimal points, five special symbols, 44 isolated characters, 57 Urdu words (mostly finance-related), and Urdu dates in different patterns. It is the first database for Urdu …
Since the Urdu language has more isolated letters than Arabic and Farsi, research on Urdu handwritten word recognition is needed. This paper presents a novel approach that uses compound features and a Support Vector Machine (SVM) for offline Urdu word recognition. Because Urdu is written in a cursive style, a holistic classification approach is adopted. Compound …
In document recognition, it is often important to obtain high accuracy or reliability and to reject patterns that cannot be classified with high confidence. This is the case for applications such as the processing of financial documents, in which errors can be very costly and therefore far less tolerable than rejections. This paper presents a new approach …
To spot the digits in a handwritten document, each component is sent to a classifier. This is a time-consuming process because a document usually contains several hundred components. A method is presented to reduce the number of candidate components from a handwritten document that are sent to the classifier. Furthermore, since the classifier does not …