Modified Makagonov's Method for Testing Word Similarity and its Application to Constructing Word Frequency Lists

Journal: Research on Computing Science
By (morphologically) similar wordforms we understand wordforms (strings) that have the same base meaning (roughly, the same root), such as sadly and sadden. The task of deciding whether two given strings are similar in this sense has numerous applications in text processing, e.g., in information retrieval, where stemming is usually employed as an intermediate step. Makagonov has suggested a weakly supervised approach for testing word similarity, based on empirical formulae comparing the…

