Learn More
In the Chinese handwriting recognition competition organized with the ICDAR 2011, four tasks were evaluated: offline and online isolated character recognition, offline and online handwritten text recognition. To enable the training of recognition systems, we announced the large databases CASIA-HWDB/OLHWDB. The submitted systems were evaluated on un-open(More)
Recently, the Institute of Automation of Chinese Academy of Sciences (CASIA) released the uncon-strained online and offline Chinese handwriting databases CASIA-OLHWDB and CASIA-HWDB, which contain isolated character samples and handwritten texts produced by 1020 writers. This paper presents our benchmarking results using state-of-the-art methods on the(More)
—This paper introduces a pair of online and offline Chinese handwriting databases, containing samples of isolated characters and handwritten texts. The samples were produced by 1,020 writers using Anoto pen on papers for obtaining both online trajectory data and offline images. Both the online samples and offline samples are divided into six datasets, three(More)
This paper presents an effective approach for the offline recognition of unconstrained handwritten Chinese texts. Under the general integrated segmentation-and-recognition framework with character oversegmentation, we investigate three important issues: candidate path evaluation, path search, and parameter estimation. For path evaluation, we combine(More)
Separating text lines in unconstrained handwritten documents remains a challenge because the handwritten text lines are often un-uniformly skewed and curved, and the space between lines is not obvious. In this paper, we propose a novel text line segmentation algorithm based on minimal spanning tree (MST) clustering with distance metric learning. Given a(More)
This paper describes a system for handwritten Chinese text recognition integrating language model. On a text line image, the system generates character segmentation and word segmentation candidates, and the candidate paths are evaluated by character recognition scores and language model. The optimal path, giving segmentation and recognition result, is found(More)
Text line extraction from unconstrained handwritten documents is a challenge because the text lines are often skewed and curved and the space between lines is not obvious. To solve this problem, we propose an approach based on minimum spanning tree (MST) clustering with new distance measures. First, the connected components of the document image are grouped(More)
—This paper presents a robust method for handwritten text line extraction. We use morphological dilation with a dynamic adaptive mask for line extraction. Line separation occurs because of the repulsion and attraction between connected components. The characteristics of the Arabic script are considered to ensure a high performance of the algorithm. Our(More)