Keyword spotting refers to the process of retrieving all instances of a given keyword from a document. In the present paper, a novel keyword spotting method for handwritten documents is described. It is derived from a neural network-based system for unconstrained handwriting recognition. As such it performs template-free spotting, i.e., it is not necessary…
Handwritten word spotting aims at making document images amenable to browsing and searching by keyword retrieval. In this paper, we present a word spotting system based on Hidden Markov Models (HMM) that uses trained subword models to spot keywords. With the proposed method, arbitrary keywords can be spotted that do not need to be present in the training…
Segmenting page images into text lines is a crucial pre-processing step for automated reading of historical documents. Challenges in this open research field include paper or parchment background noise, ink bleed-through, artifacts due to aging, stains, and touching text lines. In this paper, we present a novel binarization-free line…
For retrieving keywords from scanned handwritten documents, we present a word spotting system that is based on character Hidden Markov Models. In an efficient lexicon-free approach, arbitrary keywords can be spotted without pre-segmenting text lines into words. For a multi-writer scenario on the IAM off-line database as well as for two single writer…
Spotting keywords in handwritten documents without transcription is a valuable method as it allows one to search, index, and classify such documents. In this paper we show that keyword spotting based on bidirectional Long Short-Term Memory (BLSTM) recurrent neural nets can successfully be applied to online handwritten documents with non-text content. It…
Automatic transcription of historical documents is vital for the creation of digital libraries. In this paper we propose graph similarity features as a novel descriptor for handwriting recognition in historical documents based on Hidden Markov Models. Using a structural graph-based representation of text images, a sequence of graph similarity features is…
Handwriting recognition in historical documents is vital for the creation of digital libraries. The creation of readily available ground truth data plays a central role for the development of new recognition technologies. For historical documents, ground truth creation is more difficult and time-consuming when compared with modern documents. In this paper,…
Building recognition systems for historical documents is a difficult task, especially when it comes to medieval scripts. The complexity is mainly affected by the poor quality and the small quantity of the data available. In this paper we apply an HMM-based recognition system to medieval manuscripts from the 13th century written in Middle High German. The…