Several hybrid disambiguation methods are described which combine the strengths of handwritten disambiguation rules and statistical taggers. Three different statistical taggers (HMM, Maximum-Entropy and Averaged Perceptron) are used in a tagging experiment on the Prague Dependency Treebank. The results of the hybrid systems are better than any other method …
We present a modified version of the Transformation-Based Approach (TBA) and Transformation-Based Error-Driven Learning (TBEDL). We modified the TBA in order to work with a dependency tree structure, which describes more efficiently the syntax of inflective and free-word order languages, such as the Czech language. The major changes and characteristics are …
In this paper we present UIMA – the Unstructured Information Management Architecture, an architecture and software framework for creating, discovering, composing and deploying a broad range of multi-modal analysis capabilities and integrating them with search technologies. We describe the elementary components of the framework and how they are deployed into …
Evaluating Optical Music Recognition (OMR) is notoriously difficult and automated end-to-end OMR evaluation metrics are not available to guide development. In "Towards a Standard Testbed for Optical Music Recognition: Definitions, Metrics, and Page Images", Byrd and Simonsen recently stress that a benchmarking standard is needed in the OMR community, …
We describe the results of an investigation of a specific type of discontinuous construction, namely non-projective constructions concerning verbs and their arguments. This topic is especially important for languages with a relatively free word order, such as Czech, which is the language we have primarily worked with. For comparison, we have included some results …
We describe a method for semi-automatic extraction of Slovak multiword expressions (MWEs) from a dependency treebank. The process uses an automatic conversion from dependency syntactic trees to deep syntax and automatic tagging of verbal argument nodes based on a valency dictionary. Both the valency dictionary and the treebank conversion were adapted from …