Nasari: Integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities
This paper presents an unsupervised approach to resolving semantic ambiguity that integrates the Personalized PageRank algorithm with word-sense frequency information. Natural language tasks such as machine translation and recommender systems are likely to benefit from our approach, which selects the appropriate word sense with support from two sources: a multidimensional network that combines several resources (i.e., WordNet, WordNet Domains, WordNet Affect, SUMO and Semantic Classes), and the word-sense frequencies and word-sense collocations provided by the SemCor corpus. We analyzed our results and compared them against those of several well-known studies on the SensEval-2, SensEval-3 and SemEval-2013 datasets. Across our experiments, the approach produced the best results in the unsupervised category, taking the SensEval campaign rankings as reference.
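As an illustration of the general idea only, the following is a minimal sketch of Personalized PageRank over a toy sense graph, with sense frequencies used as the teleport distribution. It is not the implementation evaluated in this paper: the graph, the sense labels (such as `bank#finance`), the frequency counts, and the helper names `personalized_pagerank` and `best_sense` are all hypothetical, and using normalized frequencies as the teleport vector is an assumption about how the two sources could be combined.

```python
from collections import defaultdict


def personalized_pagerank(edges, teleport, damping=0.85, iterations=50):
    """Power-iteration Personalized PageRank on an undirected graph.

    edges    -- iterable of (node, node) pairs
    teleport -- dict mapping node -> non-negative weight (hypothetical
                sense frequencies here); normalized to sum to 1
    """
    # Build an undirected adjacency list.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    nodes = sorted(set(neighbors) | set(teleport))
    total = sum(teleport.values())
    prior = {n: teleport.get(n, 0.0) / total for n in nodes}

    rank = dict(prior)
    for _ in range(iterations):
        # Teleport step: a (1 - damping) share always returns to the prior.
        nxt = {n: (1.0 - damping) * prior[n] for n in nodes}
        for n in nodes:
            if neighbors[n]:
                # Walk step: spread the damped rank evenly over neighbors.
                share = damping * rank[n] / len(neighbors[n])
                for m in neighbors[n]:
                    nxt[m] += share
            else:
                # Isolated node: send its damped rank back to the prior.
                for m in nodes:
                    nxt[m] += damping * rank[n] * prior[m]
        rank = nxt
    return rank


def best_sense(rank, candidates):
    """Return the candidate sense with the highest rank."""
    return max(candidates, key=lambda s: rank[s])


# Toy sense graph (hypothetical): senses of "bank" linked to related senses.
edges = [
    ("bank#finance", "money#1"),
    ("bank#finance", "loan#1"),
    ("bank#river", "water#1"),
]

# Made-up frequency counts for the candidate senses of the words in the
# context "bank money loan"; these are NOT SemCor figures.
sense_freq = {"bank#finance": 20, "bank#river": 8, "money#1": 10, "loan#1": 5}

rank = personalized_pagerank(edges, sense_freq)
chosen = best_sense(rank, ["bank#finance", "bank#river"])
print(chosen)
```

In this sketch the frequency prior biases the random walk toward common senses, while the graph structure rewards senses that are connected to the senses of the surrounding context words.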