Overview of the SPMRL 2013 Shared Task: A Cross-Framework Evaluation of Parsing Morphologically Rich Languages
This paper presents and analyzes parsing results obtained by the task participants, and provides an analysis and comparison of the parsers across languages and frameworks, reported for gold input as well as more realistic parsing scenarios.
Is it Really that Difficult to Parse German?
Parser performance for the models trained on TüBa-D/Z is comparable to parsing results for English with the Stanford parser when trained on the Penn Treebank, suggesting that German is not harder to parse than its West-Germanic neighbor language English.
Discontinuous parsing with continuous trees
We introduce a new method for incremental shift-reduce parsing of discontinuous constituency trees, based on the fact that discontinuous trees can be transformed into continuous trees by changing the
Data-Driven Parsing with Probabilistic Linear Context-Free Rewriting Systems
This paper presents the first efficient implementation of a weighted deductive CYK parser for Probabilistic Linear Context-Free Rewriting Systems (PLCFRSs), and shows that data-driven LCFRS parsing is feasible and yields output of competitive quality.
Direct Parsing of Discontinuous Constituents in German
This paper uses a parser for Probabilistic Linear Context-Free Rewriting Systems (PLCFRS), a formalism with high expressivity, to directly parse the German NeGra and TIGER treebanks, and shows that an output quality can be achieved which is comparable to the output quality of PCFG-based systems.
Discosuite - A parser test suite for German discontinuous structures
A test suite for testing the performance of dependency and constituency parsers on non-projective dependencies and discontinuous constituents for German, based on the newly released TIGER treebank version 2.2.2, which includes a linguistic analysis of the phenomena that cause discontinuity in the TIGER annotation.
Discontinuous Incremental Shift-reduce Parsing
We present an extension to incremental shift-reduce parsing that handles discontinuous constituents, using a linear classifier and beam search. We achieve very high parsing speeds (up to 640
The IUCL+ System: Word-Level Language Identification via Extended Markov Models
The IUCL+ system combines character n-gram probabilities, lexical probabilities, word label transition probabilities and existing named entity recognition tools within a Markov model framework that weights these components and assigns a label.
PLCFRS Parsing Revisited: Restricting the Fan-Out to Two
This paper presents a parser for binary PLCFRS of fan-out two, together with a novel monotonous estimate for A* parsing, and conducts experiments on modified versions of the German NeGra treebank and the Discontinuous Penn Treebank in which all trees have block degree two.
Annotating Coordination in the Penn Treebank
This paper presents an annotation scheme for the Penn Treebank which introduces a distinction between coordinating and non-coordinating punctuation and shows that this additional annotation allows the retrieval of a considerable number of coordinate structures beyond the ones having a coordinating conjunction.