This article proposes a novel approach to rectifying the photographed image of a bound document. The surface of the document is modeled as a cylindrical surface. From the geometry of camera image formation, equations are derived that use the directrixes as a cue to map points on the surface in the 3-D scene to points on the image plane. Baselines of…
A model-based approach for rectifying the camera image of a bound document has been developed, in which the surface of the document is represented by a general cylindrical surface. The principle of using the model to unwrap the image is discussed. In practice, the skeleton of each horizontal text line is extracted to help estimate the parameter of the model,…
This paper proposes a statistical approach to degraded handwritten form image preprocessing including binarization and form line removal. The degraded image is modeled by a Markov random field (MRF) where the prior is learnt from a training set of high quality binarized images, and the probabilistic density is learnt on-the-fly from the gray-level histogram…
Many feature extraction approaches for off-line handwriting recognition (OHR) rely on accurate binarization of gray-level images. However, high-quality binarization of most real-world documents is extremely difficult due to the varying characteristics of the noise and artifacts common in such documents. Unlike most of these features, Gabor features do not require…
This paper presents a statistical approach to the preprocessing of degraded handwritten forms including the steps of binarization and form line removal. The degraded image is modeled by a Markov random field (MRF) where the hidden-layer prior probability is learned from a training set of high-quality binarized images and the observation probability density…
Offline handwriting recognition of free-flowing Arabic text is a challenging task due to the plethora of factors that contribute to the variability in the data. In this paper, we address some of these sources of variability, and present experimental results on a large corpus of handwritten documents. Specific techniques such as the application of…
This paper presents vector-model-based information retrieval (IR) for handwritten medical forms. To improve IR performance on the erroneous output of handwriting recognition (HR) systems, the vector model is modified to estimate the number of occurrences of each term from word segmentation and recognition probabilities. IR tests show…
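The idea of estimating term occurrences from segmentation and recognition probabilities can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact estimator, so the sketch assumes that each word image contributes its segmentation probability times the recognition probability of each candidate term, and that documents are ranked by plain cosine similarity. The names `expected_term_counts` and `cosine`, and the sample probabilities, are hypothetical.

```python
import math
from collections import defaultdict


def expected_term_counts(word_hypotheses):
    """Estimate term occurrences from uncertain recognizer output.

    word_hypotheses: list of (seg_prob, {term: rec_prob}) pairs, one per
    candidate word image. Assumption: a term's expected count is the sum
    of seg_prob * rec_prob over all word images (not taken from the paper).
    """
    counts = defaultdict(float)
    for seg_prob, candidates in word_hypotheses:
        for term, rec_prob in candidates.items():
            counts[term] += seg_prob * rec_prob
    return dict(counts)


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical recognizer output for a two-word form field.
doc = [
    (0.9, {"fever": 0.6, "fewer": 0.4}),
    (0.8, {"fever": 0.5, "cough": 0.5}),
]
counts = expected_term_counts(doc)
score_fever = cosine(counts, {"fever": 1.0})
score_cough = cosine(counts, {"cough": 1.0})
```

Under these assumptions a term that appears among several uncertain hypotheses accumulates a fractional count instead of being lost when it is not the top recognition choice, so a query for "fever" scores higher against this document than a query for "cough".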
Despite several decades of research in document analysis, recognition of unconstrained handwritten documents is still considered a challenging task. Previous research in this area has shown that word recognizers produce reasonably clean output when used with a restricted lexicon. But in the absence of such a restricted lexicon, the output of an unconstrained…
Handwritten text line segmentation on real-world data presents significant challenges that cannot be overcome by any single technique. Given the diversity of approaches and the recent advances in ensemble-based combination for pattern recognition problems, it is possible to improve the segmentation performance by combining the outputs from different line…