Learn More
We formalize weighted dependency parsing as searching for maximum spanning trees (MSTs) in directed graphs. Using this representation, the parsing algorithm of Eisner (1996) is sufficient for searching over all projective trees in O(n 3) time. More surprisingly, the representation is extended naturally to non-projective parsing using Chu-Liu-Edmonds (Chu(More)
For the 11th straight year, the Conference on Computational Natural Language Learning has been accompanied by a shared task whose purpose is to promote natural language processing applications and evaluate them in a standard setting. In 2009, the shared task was dedicated to the joint parsing of syntactic and semantic dependencies in multiple languages.(More)
This paper considers statistical parsing of Czech, which differs radically from English in at least two respects: (1) it is a highly inflected language, and (2) it has relatively free word order. These differences are likely to pose new problems for techniques that have been developed on English. We describe our experience in building on the parsing model(More)
A hybrid system is described which combines the strength of manual rule-writing and statistical learning, obtaining results superior to both methods if applied separately. The combination of a rule-based system and a statistical one is not parallel but serial: the rule-based system performing partial disambigua-tion with recall close to 100% is applied(More)
Part of Speech tagging for English seems to have reached the the human levels of error, but full morphological tagging for inflectionally rich languages, such as Romanian, Czech, or Hungarian, is still an open problem, and the results are far from being satisfactory. This paper presents results obtained by using a universalized exponential feature-based(More)
This paper describes POS tagging experiments with semi-supervised training as an extension to the (supervised) averaged perceptron algorithm, first introduced for this task by (Collins, 2002). Experiments with an iterative training on standard-sized supervised (manually annotated) dataset (10 6 tokens) combined with a relatively modest (in the order of 10 8(More)
We introduce a substantial update of the Prague Czech-English Dependency Treebank, a parallel corpus manually annotated at the deep syntactic layer of linguistic representation. The English part consists of the Wall Street Journal (WSJ) section of the Penn Treebank. The Czech part was translated from the English source sentence by sentence. This paper gives(More)
We define broad-coverage semantic dependency parsing (SDP) as the task of recovering sentence-internal predicate–argument relationships for all content words, i.e. the semantic structure constituting the rela-tional core of sentence meaning. Syntactic dependency parsing has seen great advances in the past decade, in part owing to relatively broad consensus(More)