MOTIVATION: Predicting the secondary structure of a protein (alpha-helix, beta-sheet, coil) is an important step towards elucidating its three-dimensional structure, as well as its function. Presently, the best predictors are based on machine learning approaches, in particular neural network architectures with a fixed, and relatively short, input window of ...
In this paper, we describe a flexible form-reader system capable of extracting textual information from accounting documents, such as invoices and bills of service companies. In this kind of document, the extraction of some information fields cannot take place without first detecting the corresponding instruction fields, which are only constrained to range in ...
We propose a novel unified approach for integrating explicit knowledge and learning by example in recurrent networks. The explicit knowledge is represented by automaton rules, which are directly injected into the connections of a network. This can be accomplished by using a technique based on linear programming, instead of learning from random initial ...
In this paper we propose an adaptive model, referred to as Recursive Neural Networks (RRNNs), for logo recognition by explicitly converting each logo item into an m-ary tree representation, where symbolic and sub-symbolic information coexist. Each node in the contour-tree is associated with an exterior or interior contour extracted from the logo instance. A feature ...
We describe an approach for table location in document images. The documents are described by means of a hierarchical representation that is based on the MXY tree. The presence of a table is hypothesized by searching parallel lines in the MXY tree of the page. This hypothesis is afterwards verified by locating perpendicular lines or white spaces in the ...
Nowadays, digital libraries have become a widely used service to store and share both born-digital documents and digital versions of works held by traditional libraries. Document images are intrinsically unstructured, and the structure and semantics of the digitized documents are for the most part lost during the conversion. Several techniques related to the ...
Text categorization is typically formulated as a concept learning problem where each instance is a single isolated document. In this paper we are interested in a more general formulation where documents are organized as page sequences, as naturally occurs in digital libraries of scanned books and magazines. We describe a method for classifying pages of ...
In this paper we focus on methods for injecting prior knowledge into adaptive recurrent networks for sequence processing. In order to increase the flexibility needed for specifying partially known rules, we propose a nondeterministic approach for modeling domain knowledge. The algorithms presented in this paper make it possible to map time-warping nondeterministic ...