SentimentWortschatz, or SentiWS for short, is a publicly available German-language resource for sentiment analysis, opinion mining, etc. It lists positive and negative sentiment-bearing words, each weighted within the interval [−1; 1], together with their part-of-speech tag and, if applicable, their inflections. The current version of SentiWS (v1.8b) contains 1,650…
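As a rough illustration of how such a weighted lexicon might be used, the sketch below parses entries and sums the weights of the tokens in a text. The tab-separated layout (`lemma|POS`, weight, comma-separated inflections) is an assumption made for this example, and the sample entries are toy values, not taken from SentiWS. The names `LexiconEntry`, `parse_entry`, `build_lookup`, and `score_text` are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class LexiconEntry(NamedTuple):
    lemma: str
    pos: str
    weight: float
    inflections: Tuple[str, ...]


def parse_entry(line: str) -> LexiconEntry:
    """Parse one line of the ASSUMED layout: lemma|POS <tab> weight <tab> infl1,infl2."""
    fields = line.rstrip("\n").split("\t")
    lemma, pos = fields[0].split("|")
    weight = float(fields[1])
    # The abstract states that weights lie within [-1; 1]; reject anything else.
    if not -1.0 <= weight <= 1.0:
        raise ValueError(f"weight out of range for {lemma!r}: {weight}")
    has_inflections = len(fields) > 2 and fields[2] != ""
    inflections = tuple(fields[2].split(",")) if has_inflections else ()
    return LexiconEntry(lemma, pos, weight, inflections)


def build_lookup(lines: Iterable[str]) -> Dict[str, float]:
    """Map every lemma and inflected form (lowercased) to its weight."""
    lookup: Dict[str, float] = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        entry = parse_entry(line)
        lookup[entry.lemma.lower()] = entry.weight
        for form in entry.inflections:
            lookup[form.lower()] = entry.weight
    return lookup


def score_text(tokens: Iterable[str], lookup: Dict[str, float]) -> float:
    """Sum the weights of known tokens; unknown tokens contribute 0."""
    return sum(lookup.get(tok.lower(), 0.0) for tok in tokens)


# Toy entries invented for this example (NOT real lexicon values).
SAMPLE_LINES = [
    "gut|ADJX\t0.5\tgute,guter",
    "schlecht|ADJX\t-0.75\tschlechte",
]
```

Summing weights is only one simple way to aggregate a lexicon; averaging or thresholding would be equally plausible choices.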
We present ExB Themis – a word alignment-based semantic textual similarity system developed for SemEval-2015 Task 2: Semantic Textual Similarity. It combines both string and semantic similarity measures as well as alignment features using Support Vector Regression. It occupies the first three places on Spanish data and additionally places second on…
In this paper, we describe MLSA, a publicly available multi-layered reference corpus for German-language sentiment analysis. The construction of the corpus is based on the manual annotation of 270 German-language sentences considering three different layers of granularity. The sentence-layer annotation, as the most coarse-grained annotation, focuses on…
The programs for the Inpatient Quality Indicators (IQIs) can be downloaded from http://www.qualityindicators.ahrq.gov/. Instructions on how to use the programs to calculate the IQI rates are contained in the companion text, Inpatient Quality Indicators: Software Documentation (both SAS and SPSS).
Preface. In health care as in other arenas, that which cannot be measured is difficult to improve. Providers, consumers, policy makers, and others seeking to improve the quality of health care need accessible, reliable indicators of quality that they can use to flag potential problems or successes; follow trends over time; and identify disparities across…
  • Robert Remus
  • 2012
We propose an approach to domain adaptation that selects those instances from a source domain training set that are most similar to a target domain. The factor by which the original source domain training set size is reduced is determined automatically by measuring domain similarity between source and target domain as well as their domain complexity variance.
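The instance-selection idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the method from the paper: similarity is taken to be cosine similarity between each source document's bag of words and an aggregate target-domain profile, and the reduction factor is passed in as a hypothetical `keep_fraction` parameter rather than derived automatically from domain similarity and complexity variance.

```python
import math
from collections import Counter
from typing import List, Sequence


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_instances(
    source_docs: Sequence[List[str]],
    target_docs: Sequence[List[str]],
    keep_fraction: float,
) -> List[List[str]]:
    """Keep the source documents most similar to the target domain.

    ASSUMPTION: the target domain is summarized by pooled term counts,
    and keep_fraction is chosen by the caller.
    """
    target_profile = Counter(tok for doc in target_docs for tok in doc)
    ranked = sorted(
        source_docs,
        key=lambda doc: cosine(Counter(doc), target_profile),
        reverse=True,
    )
    k = max(1, math.ceil(len(source_docs) * keep_fraction))
    return ranked[:k]
```

A classifier would then be trained on the returned subset instead of the full source-domain training set.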
We show that the quality of sentence-level subjectivity classification, i.e. the task of deciding whether a sentence is subjective or objective, can be improved by incorporating hitherto unused features: readability measures. We therefore investigate 6 different readability formulae and propose one of our own. Their performance is evaluated in a 10-fold cross…
This paper describes University of Leipzig's approach to SemEval-2013 task 2B on Sentiment Analysis in Twitter: message polarity classification. Our system is designed to function as a baseline, to see what we can accomplish with well-understood and purely data-driven lexical features, simple generalizations as well as standard machine learning techniques…
We propose a scheme for explicitly modeling and representing negation of word n-grams in an augmented word n-gram feature space. For the purpose of negation scope detection, we compare 2 methods: the simpler regular expression-based NegEx, and the more sophisticated Conditional Random Field-based LingScope. Additionally, we capture negation implicitly via…
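One common way to represent negation explicitly in an n-gram feature space is to mark tokens inside a negation scope before extracting n-grams. The sketch below is a simplified stand-in, not NegEx or LingScope: it assumes a small illustrative cue list and treats the scope as running from a cue to the next punctuation token. The `NOT_` prefix and the names `mark_negation` and `ngrams` are hypothetical choices for this example.

```python
import re
from typing import List

# Illustrative cue list (an assumption, far from complete).
NEGATION_CUES = {"not", "no", "never", "n't", "without"}
# A single punctuation token ends the assumed negation scope.
SCOPE_END = re.compile(r"^[.,;:!?]$")


def mark_negation(tokens: List[str]) -> List[str]:
    """Prefix tokens that fall inside an assumed negation scope with NOT_."""
    marked = []
    in_scope = False
    for tok in tokens:
        if SCOPE_END.match(tok):
            in_scope = False
            marked.append(tok)
        elif tok.lower() in NEGATION_CUES:
            in_scope = True
            marked.append(tok)
        else:
            marked.append("NOT_" + tok if in_scope else tok)
    return marked


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return word n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```

Because negated and non-negated occurrences of a word become distinct features (for example `like` versus `NOT_like`), the feature space is augmented rather than replaced.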