Thomas Proisl

This paper describes our approach to the SemEval-2013 task on “Sentiment Analysis in Twitter”. We use simple bag-of-words models, a freely available sentiment dictionary automatically extended with distributionally similar terms, as well as lists of emoticons and internet slang abbreviations in conjunction with fast and robust machine learning algorithms.
Being able to quantify the semantic similarity between two texts is important for many practical applications. SemantiKLUE combines unsupervised and supervised techniques into a robust system for measuring semantic similarity. At the core of the system is a word-to-word alignment of two texts using a maximum weight matching algorithm. The system …
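To illustrate what a word-to-word alignment by maximum weight matching can look like, here is a minimal sketch. It is not the SemantiKLUE implementation: the similarity function `toy_similarity` (a Dice coefficient over character bigrams) and the function name `align` are assumptions made for this example, and the matching is found by brute force over all assignments, which is only practical for very short texts. A real system would use a polynomial-time matching algorithm and a semantic similarity measure.

```python
from itertools import permutations


def char_bigrams(word):
    """Set of lowercase character bigrams; one-letter words fall back to the word itself."""
    w = word.lower()
    return {w[i:i + 2] for i in range(len(w) - 1)} or {w}


def toy_similarity(a, b):
    """Hypothetical stand-in for a word similarity score: Dice coefficient in [0, 1]."""
    x, y = char_bigrams(a), char_bigrams(b)
    return 2 * len(x & y) / (len(x) + len(y))


def align(tokens_a, tokens_b, sim=toy_similarity):
    """Return (total_weight, pairs) of a maximum weight one-to-one alignment.

    pairs holds (index_in_a, index_in_b) tuples. Every token of the shorter
    text is matched to a distinct token of the longer text.
    Brute force: the number of candidate assignments grows factorially.
    """
    swapped = len(tokens_a) > len(tokens_b)
    short, long_ = (tokens_b, tokens_a) if swapped else (tokens_a, tokens_b)

    best_score, best_pairs = -1.0, []
    for perm in permutations(range(len(long_)), len(short)):
        score = sum(sim(short[i], long_[j]) for i, j in enumerate(perm))
        if score > best_score:
            best_score = score
            best_pairs = list(enumerate(perm))

    if swapped:
        # Restore the (index_in_a, index_in_b) orientation.
        best_pairs = [(j, i) for i, j in best_pairs]
    return best_score, sorted(best_pairs)


if __name__ == "__main__":
    score, pairs = align(["a", "cat", "sleeps"], ["the", "cat", "slept"])
    print(score, pairs)
```

The total weight of the alignment, possibly normalised by text length, is one plausible similarity feature that a supervised model could consume; how the actual system uses the alignment is not stated in the truncated abstract.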
JSLIM is a software system for writing grammars in accordance with the SLIM theory of language. Written in Java, it is designed to facilitate the coding of grammars for morphology as well as for syntax and semantics. This paper describes the system with a focus on morphology. We show how the system works, the evolution from previous versions, and …
Burrows’s Delta is the most established measure for stylometric difference in literary authorship attribution. Several improvements on the original Delta have been proposed. However, a recent empirical study showed that none of the proposed variants constitute a major improvement in terms of authorship attribution performance. With this paper, we try to …
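For readers unfamiliar with the measure, the following is a minimal sketch of the original Burrows's Delta as it is commonly defined: relative frequencies of the n most frequent words are turned into z-scores across the corpus, and Delta between two texts is the mean absolute difference of their z-scores. The function names, the choice of the sample standard deviation, and the toy corpus are assumptions for this example; the variants discussed in the paper are not shown.

```python
from collections import Counter
from statistics import mean, stdev


def relative_freqs(tokens, vocab):
    """Relative frequency of each vocabulary word in a non-empty token list."""
    counts = Counter(tokens)
    return [counts[w] / len(tokens) for w in vocab]


def burrows_delta(texts, n_mfw=3):
    """Build a Delta function for a corpus.

    texts maps a text name to its token list (at least two texts are needed).
    Returns delta(a, b), the mean absolute z-score difference over the
    n_mfw most frequent words of the whole corpus.
    """
    corpus_counts = Counter(tok for toks in texts.values() for tok in toks)
    vocab = [w for w, _ in corpus_counts.most_common(n_mfw)]
    freqs = {name: relative_freqs(toks, vocab) for name, toks in texts.items()}

    z = {name: [] for name in texts}
    for k in range(len(vocab)):
        column = [freqs[name][k] for name in texts]
        mu, sigma = mean(column), stdev(column)  # assumption: sample std
        for name in texts:
            # A word with no variation across texts contributes nothing.
            z[name].append((freqs[name][k] - mu) / sigma if sigma > 0 else 0.0)

    def delta(a, b):
        return sum(abs(x - y) for x, y in zip(z[a], z[b])) / len(vocab)

    return delta


if __name__ == "__main__":
    corpus = {
        "A": "the cat and the dog".split(),
        "B": "the cat and the bird".split(),
        "C": "a dog or a cat or a bird".split(),
    }
    delta = burrows_delta(corpus)
    print(delta("A", "B"), delta("A", "C"))
```

In an attribution setting, a disputed text would be assigned to the candidate author whose texts yield the smallest Delta; realistic experiments use far more frequent words and much longer texts than this toy corpus.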
This paper describes our system entered for the *SEM 2013 shared task on Semantic Textual Similarity (STS). We focus on the core task of predicting the semantic textual similarity of sentence pairs. The current system utilizes machine learning techniques trained on semantic similarity ratings from the *SEM 2012 shared task; it achieved rank 20 out of 90 …
In Natural Language Processing (NLP), the quality of a system depends to a great extent on the quality of the linguistic resources it uses. Due to the unpredictable character of valency properties, a reliable source for information about valency is important for syntactic and semantic analysis. With this in mind, we discuss how the Valency Dictionary of …
This system description explains how to use several bilingual dictionaries and aligned corpora in order to create translation candidates for novel language pairs. It proposes (1) a graph-based approach which does not depend on cyclical translations and (2) a combination of this method with a collocation-based model using the multilingually aligned Europarl …