In this paper, we present a new text line extraction method for handwritten Arabic documents. The proposed technique is based on a generalized adaptive local connectivity map (ALCM) using a steerable directional filter. The algorithm is designed to solve the particularly complex problems seen in handwritten documents such as fluctuating, touching or…
This paper describes a novel recognition-driven segmentation methodology for Devanagari Optical Character Recognition. Prior approaches have used sequential rules to segment characters, followed by template matching for classification. Our method uses a graph representation to segment characters. This method allows us to segment horizontally or vertically…
This paper presents an algorithm that uses an adaptive local connectivity map to retrieve text lines from complex handwritten documents such as handwritten historical manuscripts. The algorithm is designed to solve the particularly complex problems seen in handwritten documents. These problems include fluctuating text lines, touching or crossing text…
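The abstract is cut off before it describes how the map is built, so the following is only a minimal sketch under a stated assumption: that a local connectivity map can be approximated by summing, for each pixel, the ink found in a horizontal window on the same row. The function names `local_connectivity_map` and `binarize_map`, the `half_width` parameter, and the fixed threshold are hypothetical and chosen for illustration; they are not taken from the paper.

```python
def local_connectivity_map(image, half_width):
    """For each pixel, sum the ink within +/- half_width columns on the same row.

    `image` is a list of rows of 0/1 values (1 = ink). The horizontal
    window is an assumption made for this sketch, not the paper's definition.
    """
    result = []
    for row in image:
        n = len(row)
        # Prefix sums let each window sum be computed in constant time.
        prefix = [0]
        for value in row:
            prefix.append(prefix[-1] + value)
        out = []
        for x in range(n):
            lo = max(0, x - half_width)
            hi = min(n, x + half_width + 1)
            out.append(prefix[hi] - prefix[lo])
        result.append(out)
    return result


def binarize_map(connectivity, level):
    """Mark positions whose window sum reaches `level` (a hypothetical fixed threshold)."""
    return [[1 if value >= level else 0 for value in row] for row in connectivity]


# Two separated strokes on the first row merge into one horizontal band.
page = [
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
bands = binarize_map(local_connectivity_map(page, half_width=1), level=1)
```

In this toy input the isolated ink pixels on the first row are smeared into a continuous band while the empty row stays empty, which is the kind of grouping a connectivity map is meant to expose before text lines are extracted.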
In this paper, we describe an approach to segment handwritten text, machine printed text and noise from annotated machine printed documents. Three categories of word level features are extracted. We use a modified K-Means clustering algorithm for classification followed by a relabeling procedure using Markov Random Field (MRF) based on a concept of…
Palm leaves were one of the earliest forms of writing media and their use as writing material in South and Southeast Asia has been recorded from as early as the fifth century B.C. until as recently as the late 19th century. Palm leaf manuscripts relating to art and architecture, mathematics, astronomy, astrology, and medicine dating back several hundreds of…
In this paper we present a top-down, projection-profile-based algorithm to separate text blocks from image blocks in a Devanagari document. We use a distinctive feature of Devanagari text, called Shirorekha (header line), to analyze the pattern produced by Devanagari text in the horizontal profile. The horizontal profile corresponding to a text block…
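As a rough illustration of the horizontal profile mentioned above, the sketch below counts ink pixels per row of a binary image, splits the page into bands of consecutive non-empty rows, and picks the densest row in each band as a candidate header line. The helper names (`horizontal_profile`, `split_rows`, `header_line_row`) and the `min_ink` threshold are assumptions made for this example; the truncated abstract does not give the paper's actual decision rule for telling text blocks from image blocks.

```python
def horizontal_profile(image):
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def split_rows(profile, min_ink=1):
    """Return (start, end) row ranges, end exclusive, where the profile is >= min_ink."""
    bands = []
    start = None
    for y, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = y
        elif count < min_ink and start is not None:
            bands.append((start, y))
            start = None
    if start is not None:
        bands.append((start, len(profile)))
    return bands


def header_line_row(profile, band):
    """Row with the most ink inside a band: a candidate Shirorekha row."""
    start, end = band
    return max(range(start, end), key=lambda y: profile[y])


# Toy page: two bands of ink separated by an empty row.
page = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],  # dense row, e.g. a header line
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
]
profile = horizontal_profile(page)
bands = split_rows(profile)
peaks = [header_line_row(profile, band) for band in bands]
```

A band whose profile shows one sharp peak near its top is consistent with a line of Devanagari text under its header line, whereas an image block would not be expected to produce that regular pattern.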
We outline two different techniques for OCR of machine printed, multi-font Devanagari text. In the first design, words are segmented along linear boundaries. Subsequently, classification is performed with the assumption of accurate segmentation. The second approach uses classifiers to obtain preliminary hypotheses for each segment of the word. These results…