Developing a large semantically annotated corpus
TLDR
It is argued that a bootstrapping approach, combining state-of-the-art NLP tools for parsing and semantic interpretation with a wiki-like interface for collaborative annotation by experts and a game with a purpose for crowdsourcing, provides the starting ingredients for building such a corpus.
Gamification for Word Sense Labeling
TLDR
This work shows how gold standard data for word sense disambiguation can be obtained using a “Game with a Purpose” (GWAP) called Wordrobe, which consists of a large set of multiple-choice questions on word senses generated from the Groningen Meaning Bank.
The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations
TLDR
The approach is based on cross-lingual projection: automatically produced (and manually corrected) semantic annotations for English sentences are mapped onto their word-aligned translations, assuming that the translations are meaning-preserving.
PLCFRS Parsing of English Discontinuous Constituents
TLDR
This paper uses probabilistic linear context-free rewriting systems for data-driven parsing, following recent work on parsing German, and demonstrates that by discarding information on non-local dependencies the PCFG model loses important information on syntactic dependencies in general.
UGroningen: Negation detection with Discourse Representation Structures
TLDR
This toolchain uses the C&C tools for parsing, based on the formalism of Combinatory Categorial Grammar, and applies Boxer to produce semantic representations in the form of Discourse Representation Structures (DRSs).
Kahina: A Hybrid Trace-Based and Chart-Based Debugging System for Grammar Engineering
TLDR
An overview of the debugging framework Kahina is provided, discussing its architecture, its application to debugging in different constraint-based grammar engineering environments, and the hybrid nature of the system, which combines source-level debugging by means of a tracer with high-level analysis by means of graphical tools.
Elephant: Sequence Labeling for Word and Sentence Segmentation
TLDR
It is shown that high-accuracy word and sentence segmentation can be achieved by using supervised sequence labeling on the character level combined with unsupervised feature learning.
A platform for collaborative semantic annotation
TLDR
This work is building a large corpus of public-domain English texts and annotating them semi-automatically with syntactic structures and semantic representations, including events, thematic roles, named entities, anaphora, scope, and rhetorical structure.
TuLiPA: Towards a Multi-Formalism Parsing Environment for Grammar Engineering
TLDR
An open-source parsing environment (Tübingen Linguistic Parsing Architecture, TuLiPA) is presented which uses Range Concatenation Grammar (RCG) as a pivot formalism, thus opening the way to the parsing of several mildly context-sensitive formalisms.