The Conference on Computational Natural Language Learning features a shared task, in which participants train and test their learning systems on the same data sets. In 2007, as in 2006, the shared task has been devoted to dependency parsing, this year with both a multilingual track and a domain adaptation track. In this paper, we define the tasks of the …
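To make the shared-task setup concrete, here is a minimal sketch of reading dependency annotations in the tab-separated CoNLL column format (one token per line, ten fields per token, with the head index and relation label in the seventh and eighth columns). The example sentence and its field values are illustrative assumptions, not taken from the actual shared-task releases.

# Minimal sketch: reading dependency annotations in CoNLL column format.
# Each non-blank line is one token; blank lines separate sentences.
# The sample sentence below is made up for illustration.

sample = """\
1\tEconomic\teconomic\tADJ\tJJ\t_\t2\tNMOD\t_\t_
2\tnews\tnews\tNOUN\tNN\t_\t3\tSBJ\t_\t_
3\thad\thave\tVERB\tVBD\t_\t0\tROOT\t_\t_
4\tlittle\tlittle\tADJ\tJJ\t_\t5\tNMOD\t_\t_
5\teffect\teffect\tNOUN\tNN\t_\t3\tOBJ\t_\t_
6\t.\t.\tPUNCT\t.\t_\t3\tP\t_\t_
"""

def read_conll(lines):
    """Yield sentences as lists of (id, form, head, deprel) tuples."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:                      # blank line ends a sentence
            if sentence:
                yield sentence
                sentence = []
            continue
        cols = line.split("\t")
        sentence.append((int(cols[0]), cols[1], int(cols[6]), cols[7]))
    if sentence:
        yield sentence

for sent in read_conll(sample.splitlines()):
    for tok_id, form, head, deprel in sent:
        print(tok_id, form, "->", head, deprel)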
This paper reports on the first shared task on statistical parsing of morphologically rich languages (MRLs). The task features data sets from nine languages, each available both in constituency and dependency annotation. We report on the preparation of the data sets, on the proposed parsing scenarios, and on the evaluation metrics for parsing MRLs given …
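One metric commonly reported for the dependency side of such tasks is the labeled attachment score (LAS): the proportion of tokens whose predicted head and relation label both match the gold annotation. The sketch below is a generic illustration with made-up analyses, not the scoring code used for this shared task.

# Hedged sketch of the labeled attachment score (LAS): the fraction of
# tokens whose (head, relation) pair matches the gold annotation.
# The gold and predicted analyses below are invented for illustration.

def labeled_attachment_score(gold, predicted):
    """gold and predicted are lists of (head, deprel) pairs, one per token."""
    assert len(gold) == len(predicted)
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return correct / len(gold)

gold      = [(2, "subj"), (0, "root"), (2, "obj")]
predicted = [(2, "subj"), (0, "root"), (3, "nmod")]

print(labeled_attachment_score(gold, predicted))  # 0.666... (2 of 3 tokens correct)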
The purpose of this paper is to describe the TüBa-D/Z treebank of written German and to compare it to the independently developed TIGER treebank (Brants et al., 2002). Both treebanks, TIGER and TüBa-D/Z, use an annotation framework that is based on phrase structure grammar and that is enhanced by a level of predicate-argument structure. The comparison …
This stylebook is an updated version of Telljohann et al. (2006). It describes the design principles and the annotation scheme for the German treebank TüBa-D/Z developed by the Division of Computational Linguistics (Lehrstuhl Prof. Hinrichs) at the Department of Linguistics (Seminar für Sprachwissenschaft – SfS) of the Eberhard Karls Universität Tübingen, …
This paper presents a comparative study of probabilistic treebank parsing of German, using the Negra and TüBa-D/Z treebanks. Experiments with the Stanford parser, which uses a factored PCFG and dependency model, show that, contrary to previous claims for other parsers, lexicalization of PCFG models boosts parsing performance for both treebanks. The …
The ACL 2008 Workshop on Parsing German features a shared task on parsing German. The goal of the shared task was to find reasons for the radically different behavior of parsers on the different treebanks and between constituent and dependency representations. In this paper, we describe the task and the data sets. In addition, we provide an overview of the …
In the last decade, the Penn treebank has become the standard data set for evaluating parsers. The fact that most parsers are evaluated solely on this specific data set leaves unanswered the question of how much these results depend on the annotation scheme of the treebank. In this paper, we investigate the influence that different decisions in the …
Parsing is a key task in natural language processing. It involves predicting, for each natural language sentence, an abstract representation of the grammatical entities in the sentence and the relations between these entities. This representation provides an interface to compositional semantics and to the notions of "who did what to whom." The last two …
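As a hedged illustration of what such a representation can look like (the sentence, relation labels, and helper function below are invented for exposition, not drawn from any of the papers above), a dependency analysis encodes each token with a pointer to its syntactic head and a relation label, from which "who did what to whom" can be read off the dependents of the main verb:

# Illustrative sketch of a dependency representation for one sentence.
# Token indices, heads, and relation labels are made up; 0 denotes the
# artificial root node.

from collections import namedtuple

Token = namedtuple("Token", "id form head deprel")

# "Mary gave John a book"
sentence = [
    Token(1, "Mary", 2, "subject"),
    Token(2, "gave", 0, "root"),
    Token(3, "John", 2, "indirect-object"),
    Token(4, "a",    5, "determiner"),
    Token(5, "book", 2, "object"),
]

def arguments_of(predicate_id, tokens):
    """Collect the dependents of a predicate, i.e. 'who did what to whom'."""
    return {t.deprel: t.form for t in tokens if t.head == predicate_id}

print(arguments_of(2, sentence))
# {'subject': 'Mary', 'indirect-object': 'John', 'object': 'book'}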