This paper presents a new measure of semantic similarity in an IS-A taxonomy, based on the notion of information content, which performs encouragingly well and is significantly better than the traditional edge counting approach.
This article presents a measure of semantic similarity in an IS-A taxonomy based on the notion of shared information content that performs better than the traditional edge-counting approach.
This work presents the use of supervised learning based on structural features of documents to improve classification performance, a new content-based measure of translational equivalence, and an adaptation of the system to take advantage of the Internet Archive for mining parallel text from the Web on a large scale.
Parallel text is used to help solve the problem of creating syntactic annotation in more languages: the English side of a parallel corpus is annotated, the analysis is projected to the second language, and a stochastic analyzer is trained on the resulting noisy annotations.
A new, information-theoretic account of selectional constraints is proposed, which assumes that lexical items are organized in a conceptual taxonomy according to class membership, where classes are defined simply as sets rather than in terms of explicit features or properties.
This work explores the use of the MIRA algorithm of Crammer et al. as an alternative to MERT and shows that, with parallel processing and by exploiting more of the parse forest, MIRA can obtain results that match or surpass MERT in terms of both translation quality and computational cost.
A method is presented for automatic sense disambiguation of nouns appearing within sets of related nouns, the kind of data one finds in on-line thesauri or as the output of distributional clustering algorithms.
A new corpus-based approach to prepositional phrase attachment disambiguation is described, and results comparing performance of this algorithm with other corpus-based approaches to this problem are presented.
It is argued that word lattice decoding provides a compelling model for translation of text genres as well, and that prior work in translating lattices using finite state techniques can be naturally extended to more expressive synchronous context-free grammar-based models.
Workshop On Tagging Text With Lexical Semantics…
1997
TLDR
This paper explores how a statistical model of selectional preference, requiring neither manual annotation of selection restrictions nor supervised training, can be used in sense disambiguation, and combines statistical and knowledge-based methods.