• Publications
Detecting Protagonists in German Plays around 1800 as a Classification Task
In this paper, we aim at identifying protagonists in plays automatically. To this end, we train a classifier using various features and investigate the importance of each feature. A challenging
GerDraCor-Coref: A Coreference Corpus for Dramatic Texts in German
This work presents an annotated corpus of German dramatic texts, together with a preliminary analysis of the corpus and some baseline experiments on automatic coreference resolution (CR). The analysis shows that, with respect to reference structure, dramatic texts are very different from news texts but more similar to other dialogical text types such as interviews.
Towards Bridging Resolution in German: Data Analysis and Rule-based Experiments
This work presents two datasets which contain bridging annotations, namely DIRNDL and GRAIN, and compares the performance of a rule-based system with a simple baseline approach on these two corpora.
Eponymous Heroes and Protagonists –
Within literary studies, there is a coexistence of different perspectives on protagonists, heroes or main characters in dramatic texts, which provide different definitions and strategies for the identi
A Unified Text Annotation Workflow for Diverse Goals
In computational linguistics (CL), annotation is used with the goal of compiling data as the basis for machine learning approaches and automation. At the same time, in the Humanities scholars use
DramaCoref: A Hybrid Coreference Resolution System for German Theater Plays
We present a system for resolving coreference on theater plays, DramaCoref. The system uses neural network techniques to provide a list of potential mentions. These mentions are assigned to common
Predicting Structural Elements in German Drama
This work addresses the challenge of enriching plain-text dramas with predicted TEI/XML elements by fine-tuning a pre-trained BERT transformer model on several subtasks, and shows that the architecture is able to predict the learned structural elements on unseen data for several settings and models.
Measuring the Compositionality of Noun-Noun Compounds over Time
This work uses the time-stamped Google Books corpus for diachronic investigations, and examines whether the vector-based semantic spaces extracted from this corpus are able to predict compositionality ratings, despite their inherent limitations.