Jonathan Sonntag

We introduce GraPAT, a web-based annotation tool for building graph structures over text. Graphs have been demonstrated to be relevant in a variety of quite diverse annotation efforts and in different NLP applications, and they serve to model annotators’ intuitions quite closely. In particular, in this paper we discuss the implementation of graph…
We develop a pipeline consisting of various text processing tools which is designed to assist political scientists in finding specific, complex concepts within large amounts of text. Our main focus is the interaction between the political scientists and the natural language processing groups to ensure a beneficial assistance for the political scientists and …
Accurate opinion mining requires the exact identification of the source and target of an opinion. To evaluate diverse tools, the research community relies on the existence of a gold standard corpus covering this need. Since such a corpus is currently not available for German, the Interest Group on German Sentiment Analysis decided to create such a resource …
We compare the performance of two lexicon-based sentiment systems – SentiStrength (Thelwall et al., 2012) and SO-CAL (Taboada et al., 2011) – on the two genres of newspaper text and tweets. While SentiStrength has been geared specifically toward short social-media text, SO-CAL was built for general, longer text. After the initial comparison, we successively …
We present the German Sentiment Analysis Shared Task (GESTALT), which consists of two main tasks: Source, Subjective Expression and Target Extraction from Political Speeches (STEPS) and Subjective Phrase and Aspect Extraction from Product Reviews (StAR). Both tasks focused on fine-grained sentiment analysis, extracting aspects and targets with their …
A simple conceptual model is employed to investigate events, and break the task of coreference resolution into two steps: semantic class detection and similarity-based matching. With this perspective an algorithm is implemented to cluster event mentions in a large-scale corpus. Results on test data from AQUAINT TimeML, which we annotated manually with …
We work on tools to explore text contents and metadata of newspaper articles as provided by news archives. Our tool components are being integrated into an “Exploration Workbench” for Digital Humanities researchers. Next to the conversion of different data formats and character encodings, a prominent feature of our design is its “Wizard” function for corpus …