In this paper, we present a new text line extraction method for handwritten Arabic documents. The proposed technique is based on a generalized adaptive local connectivity map (ALCM) using a steerable directional filter. The algorithm is designed to solve the particularly complex problems seen in handwritten documents such as fluctuating, touching or ...
This paper describes a novel recognition-driven segmentation methodology for Devanagari Optical Character Recognition. Prior approaches have used sequential rules to segment characters, followed by template matching for classification. Our method uses a graph representation to segment characters. This method allows us to segment horizontally or vertically ...
This paper presents an algorithm that uses an adaptive local connectivity map to retrieve text lines from complex handwritten documents such as handwritten historical manuscripts. The algorithm is designed to solve the particularly complex problems seen in handwritten documents. These problems include fluctuating text lines, touching or crossing text ...
In this paper, we describe an approach to segment handwritten text, machine printed text and noise from annotated machine printed documents. Three categories of word-level features are extracted. We use a modified K-Means clustering algorithm for classification, followed by a relabeling procedure using a Markov Random Field (MRF) based on a concept of ...
In this paper, we present a top-down, projection-profile-based algorithm to separate text blocks from image blocks in a Devanagari document. We use a distinctive feature of Devanagari text, called Shirorekha (Header Line), to analyze the pattern produced by Devanagari text in the horizontal profile. The horizontal profile corresponding to a text block ...
We outline two different techniques for OCR of machine printed, multi-font Devanagari text. In the first design, words are segmented along linear boundaries. Subsequently, classification is performed with the assumption of accurate segmentation. The second approach uses classifiers to obtain preliminary hypotheses for each segment of the word. These results ...
Separating machine printed text and handwriting from overlapping text is a challenging problem in the document analysis field, and no reliable algorithms have been developed thus far. In this paper, we propose a novel approach for separating handwriting from a binary image of overlapped text. Instead of using fixed-size training patches, we describe an ...