• Publications
An Information-Theoretic Definition of Similarity
TLDR
This work presents an information-theoretic definition of similarity that is applicable as long as there is a probabilistic model, and demonstrates how this definition can be used to measure similarity in a number of different domains.
Automatic Retrieval and Clustering of Similar Words
  • Dekang Lin
  • Computer Science
  • COLING-ACL
  • 10 August 1998
TLDR
A word similarity measure based on the distributional pattern of words allows the automatically constructed thesaurus to be significantly closer to WordNet than Roget Thesaurus is.
Dependency-Based Evaluation of Minipar
TLDR
A dependency-based method for parser evaluation is presented and a broad-coverage parser, called MINIPAR, is evaluated with the SUSANNE corpus.
DIRT - discovery of inference rules from text
TLDR
This paper proposes an unsupervised method for discovering inference rules from text, based on an extended version of Harris' Distributional Hypothesis, which states that words that occurred in the same contexts tend to be similar.
Discovering word senses from text
TLDR
A clustering algorithm called CBC (Clustering By Committee) automatically discovers word senses from text; it first discovers a set of tight clusters, called committees, that are well scattered in the similarity space.
Discovery of inference rules for question-answering
TLDR
This paper presents an unsupervised algorithm for discovering inference rules from text based on an extended version of Harris’ Distributional Hypothesis, which states that words that occurred in the same contexts tend to be similar.
DIRT – Discovery of Inference Rules from Text
In this paper, we propose an unsupervised method for discovering inference rules from text, such as “X is author of Y ≈ X wrote Y”, “X solved Y ≈ X found a solution to Y”, and “X caused Y ≈ Y is ...
Automatic Identification of Non-compositional Phrases
TLDR
This work presents a method for automatically identifying non-compositional expressions from their statistical properties in a text corpus, based on the hypothesis that when a phrase is non-compositional, its mutual information differs significantly from the mutual information of phrases obtained by substituting one of the words in the phrase with a similar word.
Using Syntactic Dependency as Local Context to Resolve Word Sense Ambiguity
TLDR
An algorithm is presented that uses the same knowledge sources to disambiguate different words, does not require a sense-tagged corpus, and exploits the fact that two different words are likely to have similar meanings if they occur in identical local contexts.
PRINCIPAR - An Efficient, Broad-coverage, Principle-based Parser
TLDR
An efficient, broad-coverage, principle-based parser for English that contains a lexicon with over 90,000 entries, constructed automatically by applying a set of extraction and conversion rules to entries from machine-readable dictionaries.