KLUE: Simple and robust methods for polarity classification
This paper uses simple bag-of-words models, a freely available sentiment dictionary automatically extended with distributionally similar terms, as well as lists of emoticons and internet slang abbreviations in conjunction with fast and robust machine learning algorithms to solve the SemEval-2013 sentiment analysis task.
SemantiKLUE: Robust Semantic Similarity at Multiple Levels Using Maximum Weight Matching
The SemantiKLUE system computes a word-to-word alignment of two texts using a maximum weight matching algorithm, combining unsupervised and supervised techniques into a robust system for measuring semantic similarity.
SentiKLUE: Updating a Polarity Classifier in 48 Hours
SentiKLUE is an update of the KLUE polarity classifier – which achieved good and robust results in SemEval-2013 with a simple feature set – implemented in 48 hours.
Using High-Quality Resources in NLP: The Valency Dictionary of English as a Resource for Left-Associative Grammars
The Valency Dictionary of English is shown to be well suited for NLP purposes: integrated into a Left-Associative Grammar, it is used to parse natural language accurately with a rule-based approach.
JSLIM - Computational Morphology in the Framework of the SLIM Theory of Language
The paper shows how the system works, how it evolved from previous versions, and how the rules for word form recognition can also be used for word form generation. It also broaches the reversibility of grammar rules, with the aim of automatic word form production without any additional rule system.
Results of the Translation Inference Across Dictionaries 2019 Shared Task
An overall description of the Translation Inference Across Dictionaries shared task, the evaluation data and methodology, and the systems’ results are given.
A Proposal for a Part-of-Speech Tagset for the Albanian Language
The Albanian language has some properties that pose difficulties for the creation of a part-of-speech tagset that can adequately represent the underlying linguistic phenomena, and this paper presents a proposal for that tagset.
Albanian Part-of-Speech Tagging: Gold Standard and Evaluation
This paper provides mappings from the full tagset to both the original Google Universal Part-of-Speech Tags and the variant used in the Universal Dependencies project and achieves accuracies of up to 95.10%.
Translation Inference across Dictionaries via a Combination of Graph-based Methods and Co-occurrence Statistics
A graph-based approach that does not depend on cyclical translations, and a combination of this method with a collocation-based model using the multilingually aligned Europarl corpus, are proposed.
The_Illiterati: Part-of-Speech Tagging for Magahi and Bhojpuri without even knowing the alphabet
In this paper, we describe the part-of-speech tagging experiments for Magahi and Bhojpuri that we conducted for our participation in the NSURL 2019 shared tasks 9 and 10 (Low-level NLP Tools for