Hongxi Wei

This paper presents a new method for recognizing machine-printed traditional Mongolian characters using back-propagation (BP) neural networks. First, the set of traditional Mongolian characters is divided into five subsets according to each character's position (initial, medial or final) within a word and certain stable structural features. Then, each subset ...
Recognizing and retrieving the Mongolian Kanjur images requires many preprocessing tasks. In this paper, we concentrate on the binarization of the Mongolian Kanjur images and propose an efficient binarization method for them. The proposed method is applied to each image as follows: First, some preprocessing tasks including ...
Many classical Mongolian historical documents are preserved only in image form, which makes them difficult to explore and retrieve. In this paper, we investigate the peculiarities of classical Mongolian documents and propose an approach to recognize the words in them. We design an algorithm to segment the Mongolian words into ...
In this paper, we propose a keyword retrieval system for locating words in historical Mongolian document images. Using word spotting, a collection of historical Mongolian document images is converted into a collection of word images by word segmentation, and a number of profile-based features are extracted to represent the word images. For ...
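As an illustration of what profile-based features can look like, the sketch below computes three common profiles from a binarized word image. The excerpt does not say which profiles the paper uses or along which axis they are taken, so the function names, the column-wise direction, and the 0/1 image encoding are all assumptions made for this example.

```python
from typing import List

# Assumed encoding: a binary image as rows of pixels, 1 = ink, 0 = background.
Image = List[List[int]]


def projection_profile(img: Image) -> List[int]:
    """Count the ink pixels in each column."""
    return [sum(col) for col in zip(*img)]


def upper_profile(img: Image) -> List[int]:
    """Distance from the top edge to the first ink pixel in each column.

    Columns without ink get the image height.
    """
    height = len(img)
    return [
        next((r for r, v in enumerate(col) if v), height)
        for col in zip(*img)
    ]


def lower_profile(img: Image) -> List[int]:
    """Distance from the bottom edge to the first ink pixel in each column.

    Columns without ink get the image height.
    """
    height = len(img)
    return [
        next((r for r, v in enumerate(reversed(col)) if v), height)
        for col in zip(*img)
    ]


# Hypothetical 3x3 word image used only to exercise the functions.
sample = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 0],
]
features = projection_profile(sample) + upper_profile(sample) + lower_profile(sample)
```

Concatenating the profiles gives one variable-length sequence per word image. Comparing two such sequences would still need a matching step, which this sketch does not cover.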
Based on the characteristics of Mongolian word formation, this paper proposes a method for removing inflectional suffixes from word images of the Mongolian Kanjur. Removing inflectional suffixes may reduce the number of clusters that serve as indexing terms in word spotting. For the above purpose, we need to determine whether or not one word ...
This paper proposes a Bag of Visual Words (BoVW) based approach for keyword spotting on Mongolian historical document images. The first step divides the scanned Mongolian historical document images into word images using preprocessing steps such as connected component analysis and binarization. Then, all of the images in our training ...
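The core BoVW step can be sketched as follows: each local descriptor of a word image is assigned to its nearest visual word, and the image is represented by the normalized histogram of those assignments. This is a generic sketch, not the paper's pipeline. The excerpt does not state which descriptors are extracted or how the codebook is built, so the two-dimensional descriptors, the fixed codebook, and the function names here are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_word(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to the descriptor."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda k: sq_dist(descriptor, codebook[k]))


def bovw_histogram(descriptors: Sequence[Vector], codebook: Sequence[Vector]) -> List[float]:
    """Normalized histogram of visual-word assignments for one image."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    if total == 0:
        # An image with no descriptors yields an all-zero histogram.
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Hypothetical codebook of two visual words and four local descriptors.
codebook = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 1.0), (2.0, 0.0), (0.0, 0.0), (9.0, 10.0)]
histogram = bovw_histogram(descriptors, codebook)
```

In practice the codebook is usually learned by clustering descriptors from training images, and a query keyword image is compared with the stored word images through their histograms.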