Learn More
This paper describes the two systems for determining the semantic similarity of short texts submitted to the SemEval 2012 Task 6. Most of the research on semantic similarity of textual content focuses on large documents. However, a fair amount of information is condensed into short text snippets such as social media posts, image captions, and scientific(More)
In online discussions, users often back up their stance with arguments. Their arguments are often vague, implicit, and poorly worded, yet they provide valuable insights into reasons underpinning users’ opinions. In this paper, we make a first step towards argument-based opinion mining from online discussions and introduce a new task of argument recognition.(More)
Online debates sparkle argumentative discussions from which generally accepted arguments often emerge. We consider the task of unsupervised identification of prominent argument in online debates. As a first step, in this paper we perform a cluster analysis using semantic textual similarity to detect similar arguments. We perform a preliminary cluster(More)
Derivational models are still an underresearched area in computational morphology. Even for German, a rather resourcerich language, there is a lack of largecoverage derivational knowledge. This paper describes a rule-based framework for inducing derivational families (i.e., clusters of lemmas in derivational relationships) and its application to create a(More)
In this paper we present CroNER, a named entity recognition and classification system for Croatian language based on supervised sequence labeling with conditional random fields (CRF). We use a rich set of lexical and gazetteer-based features and different methods for enforcing document-level label consistency. Extensive evaluation shows that our method(More)
Collocations are linguistic phenomena that occur when two or more words appear together more often than by chance and whose meaning often cannot be inferred from the meanings of its parts. As collocations have found many applications in the fields of natural language processing, information retrieval, and text mining, extracting them from large corpora has(More)
Identifying news stories that discuss the same real-world events is important for news tracking and retrieval. Most existing approaches rely on the traditional vector space model. We propose an approach for recognizing identical real-world events based on a structured, event-oriented document representation. We structure documents as graphs of event(More)