Learn More
We formalize weighted dependency parsing as searching for maximum spanning trees (MSTs) in directed graphs. Using this representation, the parsing algorithm of Eisner (1996) is sufficient for searching over all projective trees in O(n 3) time. More surprisingly, the representation is extended naturally to non-projective parsing using Chu-Liu-Edmonds (Chu(More)
Cross-linguistically consistent annotation is necessary for sound comparative evaluation and cross-lingual learning experiments. It is also useful for multilingual system development and comparative linguistic studies. Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a(More)
The availability of annotated data (with as rich and " deep " annotation as possible) is desirable in any new developments. Textual data are being used for so-called training phase of various empirical methods solving various problems in the field of computational linguistics. While there are many methods that use texts in their plain (or raw) form (in most(More)
For the 11th straight year, the Conference on Computational Natural Language Learning has been accompanied by a shared task whose purpose is to promote natural language processing applications and evaluate them in a standard setting. In 2009, the shared task was dedicated to the joint parsing of syntactic and semantic dependencies in multiple languages.(More)
This paper considers statistical parsing of Czech, which differs radically from English in at least two respects: (1) it is a highly inflected language, and (2) it has relatively free word order. These differences are likely to pose new problems for techniques that have been developed on English. We describe our experience in building on the parsing model(More)
Automatic natural language processing of large texts often presents recurring challenges in multiple languages: even for most advanced tasks, the texts are first processed by basic processing steps – from tokenization to parsing. We present an extremely simple-to-use tool consisting of one binary and one model (per language), which performs these tasks for(More)
Prague Arabic Dependency Treebank not only consists of multi-level linguistic annotations over the language of Modern Standard Arabic, but even provides a variety of unique software implementations designed for general use in Natural Language Processing (NLP). This paper delivers an overview of the recent and most interesting results, findings and(More)