Hwa Jeong Son

Many printed documents and books have been published and stored as images in digital libraries. Searching for a given query word in document images is a challenging problem. OCR software can convert the images into machine-readable documents so that the full text can be searched [1]. Another approach [1, 2] is image-based, in which …
In this paper, we propose a text matching method for document image retrieval without any language model. Two word images are first normalized to an appropriate size and image features are extracted using the local crowdedness method. Similarity between the two features is then measured by calculating a Hausdorff distance. We performed three experiments. …
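To make the similarity step concrete, here is a minimal sketch of a symmetric Hausdorff distance between two small point sets. It assumes each word image has already been reduced to a list of 2-D feature points; the abstract does not say how the local crowdedness features are represented, so that representation, the function names `directed_hausdorff` and `hausdorff`, and the sample coordinates are all illustrative assumptions rather than the authors' implementation.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical feature points for two normalized word images (assumed data).
word_a = [(0, 0), (1, 0), (2, 1)]
word_b = [(0, 0), (1, 1), (2, 1)]

print(hausdorff(word_a, word_b))  # 1.0
```

A smaller distance means the two feature sets are more alike, so a retrieval system could rank candidate word images by this value. The brute-force comparison is quadratic in the number of points, which is acceptable for a sketch but may need a spatial index for large feature sets.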
This paper describes a method to extract words from table regions in document images. The proposed approach consists of two stages: cell detection and word extraction. In the cell detection module, a table frame is extracted first by analyzing connected components and then intersection points are detected by a method using masks in the table frame. We …
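As a rough illustration of mask-based intersection detection, the sketch below slides a cross-shaped 3x3 mask over a binary image of the table frame and reports every pixel where all five mask positions are set. The abstract does not specify the masks used, so the `CROSS` mask, the function name `find_intersections`, and the sample frame are assumptions. This simple mask only finds interior crossings; T-junctions and corners at the table border would need additional masks.

```python
# Offsets (row, col) of a cross-shaped 3x3 mask: centre plus four neighbours.
# This mask is an assumption; the paper's actual masks are not given.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def find_intersections(img):
    """Return (row, col) pixels where every position of the cross mask is set."""
    height, width = len(img), len(img[0])
    hits = []
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            if all(img[r + dr][c + dc] for dr, dc in CROSS):
                hits.append((r, c))
    return hits


# Hypothetical 5x5 table frame: one horizontal and one vertical ruling line.
frame = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

print(find_intersections(frame))  # [(2, 2)]
```

The detected intersection points could then be paired up to delimit cell rectangles, which is the usual next step before extracting the words inside each cell.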