Towards clustering-based word sense discrimination

Abstract

This paper describes a series of experiments in grouping similar words using context features derived from a corpus. The goal is to find an approach suitable for cleaning the fuzzy WordNet synsets obtained by automatically translating Serbian synsets into Slovene. Similar techniques have already been used successfully by a number of researchers, and they are particularly attractive because they are knowledge-lean and rely only on evidence found in raw text. A selection of features and settings is tested on sample test sets with hierarchical clustering, an unsupervised machine learning method. In the final part of the paper, the obtained results are analyzed and the optimal set of features is selected, followed by a discussion of the results and some further
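The general idea of clustering words by their corpus contexts can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English sentences, the target words, the window size, the cosine distance, and the average-linkage criterion are all assumptions chosen for illustration, and the function names `context_vectors`, `cosine_distance`, and `agglomerative` are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def context_vectors(sentences, targets, window=2):
    """Count the words occurring within +/- `window` tokens of each target."""
    vectors = {t: Counter() for t in targets}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok in vectors:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        vectors[tok][tokens[j]] += 1
    return vectors


def cosine_distance(a, b):
    """1 - cosine similarity of two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0


def agglomerative(vectors, n_clusters):
    """Bottom-up clustering: repeatedly merge the two clusters with the
    smallest average pairwise distance until `n_clusters` remain."""
    clusters = [[w] for w in sorted(vectors)]

    def linkage(c1, c2):
        total = sum(cosine_distance(vectors[x], vectors[y])
                    for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


# Toy corpus (assumed data, not from the paper).
sentences = [
    "the dog barked at the postman",
    "the puppy barked at the postman",
    "the bank raised the interest rate",
    "the lender raised the interest rate",
]
targets = ["dog", "puppy", "bank", "lender"]

clusters = agglomerative(context_vectors(sentences, targets), n_clusters=2)
print(clusters)
```

On this toy corpus the words that share contexts end up together: "dog" with "puppy" and "bank" with "lender". A real experiment would replace the raw window counts with whichever feature set is being evaluated and would cut the cluster hierarchy at a level suited to the synsets being cleaned.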

Cite this paper

@inproceedings{Fiser2006TowardsCW, title={Towards clustering-based word sense discrimination}, author={Darja Fiser and Spela Vintar and Ljupco Todorovski}, year={2006} }