Overview of the SPMRL 2013 Shared Task: A Cross-Framework Evaluation of Parsing Morphologically Rich Languages
- Djamé Seddah, Reut Tsarfaty, Eric Villemonte de la Clergerie
- Computer Science
- SPMRL@EMNLP
- 18 October 2013
This paper presents and analyzes the parsing results obtained by the task participants and compares the parsers across languages and frameworks, for gold input as well as for more realistic parsing scenarios.
Is it Really that Difficult to Parse German?
- Sandra Kübler, E. Hinrichs, Wolfgang Maier
- Computer Science
- Conference on Empirical Methods in Natural…
- 22 July 2006
Parser performance for the models trained on TüBa-D/Z is comparable to parsing results for English with the Stanford parser when trained on the Penn Treebank, suggesting that German is not harder to parse than its West Germanic neighbor, English.
Data-Driven Parsing with Probabilistic Linear Context-Free Rewriting Systems
- Laura Kallmeyer, Wolfgang Maier
- Computer Science
- International Conference on Computational…
- 23 August 2010
This paper presents the first efficient implementation of a weighted deductive CYK parser for Probabilistic Linear Context-Free Rewriting Systems (PLCFRSs), and shows that data-driven LCFRS parsing is feasible and yields output of competitive quality.
Discontinuous parsing with continuous trees
We introduce a new method for incremental shift-reduce parsing of discontinuous constituency trees, based on the fact that discontinuous trees can be transformed into continuous trees by changing the…
PLCFRS Parsing Revisited: Restricting the Fan-Out to Two
This paper presents a parser for binary PLCFRS of fan-out two, together with a novel monotonous estimate for A* parsing, and conducts experiments on modified versions of the German NeGra treebank and the Discontinuous Penn Treebank in which all trees have block degree two.
Direct Parsing of Discontinuous Constituents in German
- Wolfgang Maier
- Computer Science
- SPMRL@NAACL-HLT
- 5 June 2010
This paper uses a parser for Probabilistic Linear Context-Free Rewriting Systems (PLCFRS), a formalism with high expressivity, to directly parse the German NeGra and TIGER treebanks, and shows that the output quality achieved is comparable to that of PCFG-based systems.
Discosuite - A parser test suite for German discontinuous structures
- Wolfgang Maier, Miriam Kaeshammer, Peter Baumann, Sandra Kübler
- Computer Science
- International Conference on Language Resources…
- 1 May 2014
A test suite for evaluating the performance of dependency and constituency parsers on non-projective dependencies and discontinuous constituents for German, based on the newly released TIGER treebank version 2.2.2, which includes a linguistic analysis of the phenomena that cause discontinuity in the TIGER annotation.
Discontinuous Incremental Shift-reduce Parsing
- Wolfgang Maier
- Computer Science
- Annual Meeting of the Association for…
- 1 July 2015
We present an extension to incremental shift-reduce parsing that handles discontinuous constituents, using a linear classifier and beam search. We achieve very high parsing speeds (up to 640…
The IUCL+ System: Word-Level Language Identification via Extended Markov Models
The IUCL+ system combines character n-gram probabilities, lexical probabilities, word label transition probabilities, and existing named entity recognition tools within a Markov model framework that weights these components and assigns a label.
Annotating Coordination in the Penn Treebank
This paper presents an annotation scheme for the Penn Treebank that distinguishes coordinating from non-coordinating punctuation, and shows that this additional annotation allows the retrieval of a considerable number of coordinate structures beyond those that have a coordinating conjunction.