Anke Lüdeling

Tools for linguistic annotation employ different data models and accompanying visualization metaphors, depending on the particular type of annotation envisaged. When a corpus is to be annotated on multiple layers, and the annotations are to be related to one another, the output formats of the annotation tools need to be unified. We describe an implemented …
The three important aspects are unintentionality, unlimitedness, and regularity. They are all interdependent. The first aspect – unintentionality – helps us to distinguish between productivity (which is a linguistic rule-based notion) and creativity (which is a general cognitive ability and cannot be captured within morphology alone): Words formed by …
Parsing learner data poses a great challenge for standard tools, since non-canonical and unusual structures may lead to wrong interpretations on the part of the taggers and parsers. It is well known that providing a statistical parser with perfect part-of-speech (POS) tags is of great benefit for parsing accuracy, and that parsing results can decrease …
This paper describes an approach for storing and querying a large corpus of linguistically annotated historical texts in a relational database management system. Texts in such a corpus have a complex structure consisting of multiple text layers that are richly annotated and aligned to each other. Modeling and managing such corpora poses various challenges …
Our study is concerned with the identification of 'difficult' structures in the acquisition of a foreign language, which will shed light on theoretical considerations of L2 processing. We argue that – compared to simple vocabulary items or abstract syntactic patterns – structures that contain lexical material as well as categorial variables …
This paper presents the design and architecture of a diachronic corpus of German. We describe the corpus architecture with a focus on the use and restrictions of XML as the data exchange and storage format. In our approach, a relational database will supplement the XML representation to support sophisticated search and presentation facilities. This is a …
Learner corpora consist of texts produced by non-native speakers. In addition to these texts, some learner corpora also contain error annotations, which can reveal common errors made by language learners, and provide training material for automatic error correction. We present a novel type of error-annotated learner corpus containing sequences of revised …