The quantification of lexical semantic relatedness has many applications in NLP, and many different measures have been proposed. We evaluate five of these measures, all of which use WordNet as their central resource, by comparing their performance in detecting and correcting real-word spelling errors. An information-content–based measure proposed by Jiang …
In text, lexical cohesion is the result of chains of related words that contribute to the continuity of lexical meaning. These lexical chains are a direct result of units of text being "about the same thing," and finding text structure involves finding units of text that are about the same thing. Hence, computing the chains is useful, since they will have a …
Five different proposed measures of similarity or semantic distance in WordNet were experimentally compared by examining their performance in a real-word spelling correction system. It was found that Jiang and Conrath's measure gave the best results overall. That of Hirst and St-Onge seriously over-related, that of Resnik seriously under-related, and those …
Spelling errors that happen to result in a real word in the lexicon cannot be detected by a conventional spelling checker. We present a method for detecting and correcting many such errors by identifying tokens that are semantically unrelated to their context and are spelling variations of words that would be related to the context. Relatedness to context …
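The detection idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract is truncated before it says how relatedness is computed, so the `related` function here is a hypothetical toy lookup table, and `spelling_variations`, `suspect_tokens`, the lexicon, and the example tokens are all assumptions made for illustration. A token is flagged only if it is unrelated to every other token and at least one single-edit variation of it is related to the context.

```python
import string

# Toy data (assumed for illustration only): a tiny lexicon and a
# hand-written table of "related" word pairs standing in for a real
# semantic relatedness measure.
LEXICON = {"boat", "docked", "peer", "pier", "beer"}
RELATED_PAIRS = {
    frozenset(pair)
    for pair in [("boat", "pier"), ("boat", "docked"), ("docked", "pier")]
}


def related(a, b):
    """Hypothetical relatedness test: a lookup in the toy table."""
    return frozenset((a, b)) in RELATED_PAIRS


def spelling_variations(word, lexicon):
    """Lexicon words one insertion, deletion, substitution, or
    transposition away from `word` (excluding `word` itself)."""
    letters = string.ascii_lowercase
    candidates = set()
    for i in range(len(word) + 1):
        left, right = word[:i], word[i:]
        if right:
            candidates.add(left + right[1:])  # deletion
            candidates.update(left + c + right[1:] for c in letters)  # substitution
        if len(right) > 1:
            candidates.add(left + right[1] + right[0] + right[2:])  # transposition
        candidates.update(left + c + right for c in letters)  # insertion
    candidates.discard(word)
    return candidates & set(lexicon)


def suspect_tokens(tokens, lexicon, related_fn):
    """Return (index, token, suggestions) for tokens that are unrelated
    to their context but have a spelling variation that is related."""
    results = []
    for i, token in enumerate(tokens):
        context = [t for j, t in enumerate(tokens) if j != i]
        if any(related_fn(token, c) for c in context):
            continue  # the token fits its context; not a suspect
        suggestions = sorted(
            v
            for v in spelling_variations(token, lexicon)
            if any(related_fn(v, c) for c in context)
        )
        if suggestions:
            results.append((i, token, suggestions))
    return results


flagged = suspect_tokens(["boat", "docked", "peer"], LEXICON, related)
```

With the toy data above, "peer" is unrelated to "boat" and "docked", while its variation "pier" is related to both, so "peer" is flagged with "pier" as the suggested correction. The other variation, "beer", is in the lexicon but unrelated to the context, so it is not suggested.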
We develop a new computational model for representing the fine-grained meanings of near-synonyms and the differences between them. We also develop a lexical-choice process that can decide which of several near-synonyms is most appropriate in a particular situation. This research has direct applications in machine translation and text generation. We first …
Knowing the degree of antonymy between words has widespread applications in natural language processing. Manually created lexicons have limited coverage and do not include most semantically contrasting word pairs. We present a new automatic and empirical measure of antonymy that combines corpus statistics with the structure of a published thesaurus. The …
We propose a framework to derive the distance between concepts from distributional measures of word co-occurrences. We use the categories in a published thesaurus as coarse-grained concepts, allowing all possible distance values to be stored in a concept–concept matrix roughly 0.01% the size of that created by existing measures. We show that the newly …
The automatic ranking of word pairs by their semantic relatedness, in a way that mimics human notions of semantic relatedness, has widespread applications. Measures that rely on raw data (distributional measures) and those that use knowledge-rich ontologies both exist. Although extensive studies have been performed to compare ontological measures with …
Knowing the degree of semantic contrast between words has widespread application in natural language processing, including machine translation, information retrieval, and dialogue systems. Manually created lexicons focus on opposites, such as hot and cold. Opposites are of many kinds, such as antipodals, complementaries, and gradable. However, existing …