Sense-linking in a Machine Readable Dictionary

Abstract

Dictionaries contain a rich set of relationships between their senses, but often these relationships are only implicit. We report on our experiments to automatically identify links between the senses in a machine-readable dictionary. In particular, we automatically identify instances of zero-affix morphology, and use that information to find specific linkages between senses. This work has provided insight into the performance of a stochastic tagger.

1 Introduction

Machine-readable dictionaries contain a rich set of relationships between their senses, and indicate them in a variety of ways. Sometimes the relationship is provided explicitly, such as with a synonym or antonym reference. More commonly the relationship is only implicit, and needs to be uncovered through outside mechanisms. This paper describes our efforts at identifying these links.

The purpose of the research is to obtain a better understanding of the relationships between word meanings, and to provide data for our work on word-sense disambiguation and information retrieval. Our hypothesis is that retrieving documents on the basis of word senses (instead of words) will result in better performance. Our approach is to treat the information associated with dictionary senses (part of speech, subcategorization, subject area codes, etc.) as multiple sources of evidence (cf. Krovetz [3]). This process is fundamentally a divisive one, and each of the sources of evidence has exceptions (i.e., instances in which senses are related in spite of being separated by part of speech, subcategorization, or morphology). Identifying related senses will help us to test the hypothesis that unrelated meanings will be more effective at separating relevant from nonrelevant documents than meanings which are related.

We will first discuss some of the explicit indications of sense relationships as found in usage notes and deictic references. We will then describe our efforts at uncovering the implicit relationships via stochastic tagging and word collocation.

2 Explicit Sense Links

The dictionary we are using in our research, the Longman Dictionary of Contemporary English (LDOCE), is a dictionary for learners of English as a second language. As such, it provides a great deal of information about word meanings in the form of example sentences, usage notes, and grammar codes. The Longman dictionary is also unique among learner's dictionaries in that its definitions are generally written using a controlled vocabulary of approximately 2200 words. When exceptions occur they are indicated by means of a different font. For example, consider the definition of the word


Cite this paper

@inproceedings{Krovetz1992SenselinkingIA, title={Sense-linking in a Machine Readable Dictionary}, author={Robert Krovetz}, year={1992} }