UKP: Computing Semantic Textual Similarity by Combining Multiple Content Similarity Measures
This work uses a simple log-linear regression model, trained on the training data, to combine multiple text similarity measures of varying complexity, which range from simple character and word n-grams and common subsequences to complex features such as Explicit Semantic Analysis vector comparisons and aggregation of word similarity based on lexical-semantic resources.
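The combination idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two toy measures (character trigram overlap and word overlap), the `log(1 + x)` feature transform, the gradient-descent fit, and the tiny hand-made training pairs. It is not the UKP system's feature set or training procedure.

```python
import math

def char_ngrams(text, n=3):
    # Set of character n-grams of the lowercased text.
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def jaccard(a, b):
    # Jaccard overlap of two sets; two empty sets count as identical.
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def features(t1, t2):
    # Two toy similarity measures (hypothetical stand-ins for a larger set).
    f_char = jaccard(char_ngrams(t1), char_ngrams(t2))
    f_word = jaccard(set(t1.lower().split()), set(t2.lower().split()))
    # Log-transform the feature values (+1 keeps zero overlap finite).
    return [math.log(1.0 + f_char), math.log(1.0 + f_word)]

def fit_linear(X, y, lr=0.1, epochs=5000):
    # Batch gradient descent on squared error; w[0] is the bias term.
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def predict(w, x):
    # Weighted sum of the (log-transformed) similarity features.
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))

# Made-up training pairs with made-up gold similarity scores.
train_pairs = [
    ("a man is playing a guitar", "a man plays the guitar", 4.0),
    ("a dog runs in the park", "the stock market fell today", 0.0),
    ("a woman is slicing an onion", "a woman is slicing an onion", 5.0),
    ("two kids play soccer", "children are playing football outside", 2.5),
]
X = [features(a, b) for a, b, _ in train_pairs]
y = [score for _, _, score in train_pairs]
weights = fit_linear(X, y)
```

Calling `predict(weights, features(text_a, text_b))` then yields a combined similarity score for a new pair; richer measures would simply become additional entries in the feature vector.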
Extracting Lexical Semantic Knowledge from Wikipedia and Wiktionary
This paper presents two application programming interfaces for Wikipedia and Wiktionary which are especially designed for mining the rich lexical semantic information dispersed in the knowledge bases, and provide efficient and structured access to the available knowledge.
Using Wiktionary for Computing Semantic Relatedness
It is shown that Wiktionary is the best lexical semantic resource in the ranking task and performs comparably to other resources in the word choice task, and the concept vector based approach yields the best results on all datasets in both evaluations.
Wisdom of crowds versus wisdom of linguists – measuring the semantic relatedness of words
A vector-based measure of semantic relatedness, relying on a concept space built from documents, is applied to the first paragraphs of Wikipedia articles, to English WordNet glosses, and to GermaNet-based pseudo glosses.
Analysis of the Wikipedia Category Graph for NLP Applications
A graph-theoretic analysis of the category graph is performed, and it is shown to be a scale-free, small-world graph like other well-known lexical semantic networks.
Automatically Creating Datasets for Measures of Semantic Relatedness
A corpus-based system for automatically creating test datasets that cover all types of lexical-semantic relations and contain domain-specific words naturally occurring in texts is proposed.
Approximate Matching for Evaluating Keyphrase Extraction
For the first time, the results of state-of-the-art unsupervised and supervised keyphrase extraction approaches on three evaluation datasets are compared and it is shown that the relative performance of the approaches heavily depends on the evaluation metric as well as on the properties of the evaluation dataset.
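To see why the choice of evaluation metric matters, consider a minimal sketch that scores the same extracted keyphrases under exact matching and under a relaxed matching rule. The relaxed rule used here (one phrase occurring as a contiguous token run inside the other) and the example phrases are assumptions for illustration, not necessarily the matching strategy defined in the paper.

```python
def tokens(phrase):
    # Lowercased whitespace tokens of a phrase.
    return phrase.lower().split()

def contains(longer, shorter):
    # True if `shorter` occurs as a contiguous token run inside `longer`.
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))

def exact_match(extracted, gold):
    return tokens(extracted) == tokens(gold)

def approx_match(extracted, gold):
    # Hypothetical relaxed rule: either phrase is contained in the other.
    e, g = tokens(extracted), tokens(gold)
    return contains(e, g) or contains(g, e)

def precision(extracted, gold, match):
    # Fraction of extracted phrases that match at least one gold phrase.
    if not extracted:
        return 0.0
    hits = sum(any(match(e, g) for g in gold) for e in extracted)
    return hits / len(extracted)

# Made-up gold standard and system output.
gold = ["semantic relatedness", "Wikipedia category graph"]
extracted = ["semantic relatedness", "category graph", "evaluation"]

exact_p = precision(extracted, gold, exact_match)    # 1 of 3 phrases match
approx_p = precision(extracted, gold, approx_match)  # 2 of 3 phrases match
```

Under exact matching, "category graph" is counted as an error even though it is a near miss for "Wikipedia category graph"; the relaxed rule credits it, so the same system output receives a different score depending on the metric.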
SemEval-2013 Task 5: Evaluating Phrasal Semantics
This paper describes SemEval-2013 Task 5, “Evaluating Phrasal Semantics”, introduces the participating systems, and discusses the evaluation results.
Language Technologies for the Challenges of the Digital Age Proceedings of the GermEval 2017 – Shared Task on Aspect-based Sentiment in Social Media Customer Feedback
This paper describes the GermEval 2017 shared task on Aspect-Based Sentiment Analysis that consists of four subtasks: relevance, document-level sentiment polarity, aspect-level polarity and opinion
DKPro Similarity: An Open Source Framework for Text Similarity
The goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces and come with a set of full-featured experimental setups which can be run out-of-the-box and which future systems can build upon.