Semantic Scholar Corpus ID: 11812

An Improved Model of Semantic Similarity Based on Lexical Co-Occurrence

@inproceedings{Rohde2005AnIM,
  title={An Improved Model of Semantic Similarity Based on Lexical Co-Occurrence},
  author={Douglas L. T. Rohde and David C. Plaut},
  year={2005}
}
The lexical semantic system is an important component of human language and cognitive processing. One approach to modeling semantic knowledge makes use of hand-constructed networks or trees of interconnected word senses (Miller, Beckwith, Fellbaum, Gross, & Miller, 1990; Jarmasz & Szpakowicz, 2003). An alternative approach seeks to model word meanings as high-dimensional vectors, which are derived from the co-occurrence of words in unlabeled text corpora (Landauer & Dumais, 1997; Burgess & Lund… 
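The co-occurrence approach described in the abstract can be illustrated with a toy sketch (hypothetical corpus, window size, and raw counts chosen for illustration; real models such as HAL or the paper's COALS use large corpora and weighted or transformed counts):

```python
from collections import Counter, defaultdict
from math import sqrt

def cooccurrence_vectors(corpus, window=2):
    """Count, for each word, how often every other word appears
    within `window` positions of it across all sentences."""
    vecs = defaultdict(Counter)
    for sent in corpus:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vecs[w][sent[j]] += 1
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "the cat chased the dog".split(),
]
vecs = cooccurrence_vectors(corpus)
print(cosine(vecs["cat"], vecs["dog"]))  # words in similar contexts score high
```

Words that occur in similar contexts ("cat" and "dog" here) end up with similar count vectors, which is the core intuition behind vector-space models of meaning.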
Estimating the average need of semantic knowledge from distributional semantic models
TLDR
It is argued that CBOW learns word meanings according to Anderson’s concept of needs probability, and that it can account for nearly all of the variation in lexical access measures typically attributable to word frequency and contextual diversity.
The principals of meaning: Extracting semantic dimensions from co-occurrence models of semantics
TLDR
It is shown that the skip-gram model accounts for unique variance in behavioral measures of lexical access above and beyond that accounted for by affective and lexical measures, raising the possibility that word frequency predicts behavioral measures of lexical access because word use is organized by semantics.
The Role of Negative Information in Distributional Semantic Learning
TLDR
The role of negative information in developing semantic representations is assessed, showing that its power does not reflect the use of a prediction mechanism, and that negative information can be efficiently integrated into classic count-based semantic models using parameter-free analytical transformations.
Performance impact of stop lists and morphological decomposition on word–word corpus-based semantic space models
TLDR
From this study, morphological decomposition appears to significantly improve performance in word–word co-occurrence semantic space models, providing some support for the claim that sublexical information—specifically, word morphology—plays a role in lexical semantic processing.
A hybrid method based on WordNet and Wikipedia for computing semantic relatedness between texts
TLDR
This work uses two well-known knowledge bases, namely WordNet and Wikipedia, to provide a more complete data source for computing semantic relatedness with higher accuracy.
A Large Probabilistic Semantic Network Based Approach to Compute Term Similarity
TLDR
This paper introduces a clustering approach to orthogonalize the concept space in order to improve the accuracy of the similarity measure, and proposes an efficient and effective approach to semantic similarity using a large-scale semantic network.
Organizing the space and behavior of semantic models
TLDR
A general framework for organizing the space of semantic models is proposed and it is illustrated how this framework can be used to understand model comparisons in terms of individual manipulations along sub-processes.
Semantic Similarity from Natural Language and Ontology Analysis
TLDR
This book proposes an in-depth characterization of existing proposals for semantic similarity estimation by discussing their features, the assumptions on which they are based and empirical results regarding their performance in particular applications, and provides a detailed discussion on the foundations of semantic measures.
Comparing Predictive and Co-occurrence Based Models of Lexical Semantics Trained on Child-directed Speech
TLDR
It is found that models that perform some form of abstraction outperform those that do not, and that co-occurrence-based abstraction models performed best. However, different models excel at different categories, providing evidence for complementary learning systems.
Supervised word sense disambiguation using semantic diffusion kernel

References

Showing 1–10 of 49 references.
An introduction to latent semantic analysis
TLDR
The adequacy of LSA's reflection of human knowledge has been established in a variety of ways, for example, its scores overlap those of humans on standard vocabulary and subject matter tests; it mimics human word sorting and category judgments; it simulates word‐word and passage‐word lexical priming data.
Latent Semantic Analysis Approaches to Categorization
TLDR
Latent Semantic Analysis creates high dimensional vectors for concepts in semantic memory through statistical analysis of a large representative corpus of text rather than subjective feature sets linked to object names, and multivariate analyses of similarity matrices show more cohesive structure for natural kinds than for artifacts.
Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy
This paper presents a new approach for measuring semantic similarity/distance between words and concepts. It combines a lexical taxonomy structure with corpus statistical information so that the…
Using Measures of Semantic Relatedness for Word Sense Disambiguation
TLDR
This paper generalizes the Adapted Lesk Algorithm to a method of word sense disambiguation based on semantic relatedness and finds that the gloss overlaps of Adapted Lesk and the semantic distance measure of Jiang and Conrath (1997) result in the highest accuracy.
Producing high-dimensional semantic spaces from lexical co-occurrence
TLDR
A procedure that processes a corpus of text and produces, for each word, a numeric vector containing information about its meaning; these vectors provide the basis for a representational model of semantic memory, the Hyperspace Analogue to Language (HAL).
A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge.
TLDR
A new general theory of acquired similarity and knowledge representation, latent semantic analysis (LSA), is presented and used to successfully simulate such learning and several other psycholinguistic phenomena.
Modelling Parsing Constraints with High-dimensional Context Space
TLDR
It is proposed that HAL's high-dimensional context space can be used to provide a basic categorisation of semantic and grammatical concepts, model certain aspects of morphological ambiguity in verbs, and provide an account of semantic context effects in syntactic processing.
Roget's thesaurus and semantic similarity
TLDR
A system that measures semantic similarity using a computerized 1987 Roget's Thesaurus, and evaluated it by performing a few typical tests, comparing the results with those produced by WordNet-based similarity measures.
Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL
This paper presents a simple unsupervised learning algorithm for recognizing synonyms, based on statistical data acquired by querying a Web search engine. The algorithm, called PMI-IR, uses Pointwise…
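The PMI score underlying an approach like PMI-IR can be sketched as follows (the hit counts and corpus size here are hypothetical stand-ins; the actual algorithm queries a Web search engine and uses specific query operators):

```python
from math import log2

# Hypothetical hit counts standing in for Web-search results.
N = 1_000_000          # assumed total number of indexed documents
hits = {"car": 12000, "automobile": 4000, "banana": 9000}
joint = {("car", "automobile"): 1800, ("car", "banana"): 60}

def pmi(w1, w2):
    """Pointwise mutual information: log2 of how much more often
    w1 and w2 co-occur than independence would predict."""
    p1, p2 = hits[w1] / N, hits[w2] / N
    p12 = joint[(w1, w2)] / N
    return log2(p12 / (p1 * p2))

print(pmi("car", "automobile"))  # high: candidate synonyms
print(pmi("car", "banana"))      # low: unrelated words
```

A positive score means the pair co-occurs more often than chance; on a synonym test such as TOEFL, the candidate with the highest PMI against the problem word would be chosen.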
The Measurement of Textual Coherence with Latent Semantic Analysis.
TLDR
The approach for predicting coherence through reanalyzing sets of texts from 2 studies that manipulated the coherence of texts and assessed readers’ comprehension indicates that the method is able to predict the effect of text coherence on comprehension and is more effective than simple term‐term overlap measures.