Efficient Graph-based Word Sense Induction by Distributional Inclusion Vector Embeddings
@article{Chang2018EfficientGW,
  title   = {Efficient Graph-based Word Sense Induction by Distributional Inclusion Vector Embeddings},
  author  = {Haw-Shiuan Chang and Amol Agrawal and Ananya Ganesh and Anirudha Desai and Vinayak Mathur and Alfred Hough and Andrew McCallum},
  journal = {ArXiv},
  year    = {2018},
  volume  = {abs/1804.03257}
}
Word sense induction (WSI), which addresses polysemy by unsupervised discovery of multiple word senses, resolves ambiguities for downstream NLP tasks and also makes word representations more interpretable. This paper proposes an accurate and efficient graph-based method for WSI that builds a global non-negative vector embedding basis (whose basis vectors are interpretable like topics) and clusters the basis indexes in the ego network of each polysemous word. By adopting distributional inclusion vector…
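The pipeline sketched in the abstract (keep a word's dominant non-negative basis dimensions, then cluster them in its ego network) can be illustrated in a few lines. This is a hedged sketch, not the authors' implementation: the similarity dictionary, the threshold, and the use of connected components as the clustering step are all illustrative assumptions.

```python
# Illustrative sketch (not the paper's code): cluster the top basis
# (topic) indexes of a polysemous word's ego network into senses.
# Topics that co-occur strongly are linked; each connected component
# of the ego network is read as one induced sense.

from collections import defaultdict

def induce_senses(top_topics, topic_similarity, threshold=0.5):
    """Group a word's dominant topic indexes into sense clusters.

    top_topics       -- basis indexes with high weight for the word
    topic_similarity -- dict mapping (i, j) pairs to a similarity score
    threshold        -- minimum similarity to draw an edge in the ego network
    """
    # Build the ego network over the word's dominant topics.
    adj = defaultdict(set)
    for i in top_topics:
        for j in top_topics:
            if i < j and topic_similarity.get((i, j), 0.0) >= threshold:
                adj[i].add(j)
                adj[j].add(i)

    # Read each connected component as one induced sense.
    senses, seen = [], set()
    for start in top_topics:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        senses.append(sorted(component))
    return senses

# Toy example: topics 0/1 (finance) vs. topics 5/6 (geography) for "bank".
sim = {(0, 1): 0.9, (5, 6): 0.8, (1, 5): 0.1}
print(induce_senses([0, 1, 5, 6], sim))  # → [[0, 1], [5, 6]]
```

The paper's actual method clusters basis indexes with more care than connected components; the sketch only shows why a topic-like, non-negative basis makes the induced senses interpretable.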
8 Citations
Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection
- Computer Science, NAACL
- 2018
Distributional inclusion vector embedding (DIVE) is introduced, a simple-to-implement unsupervised method of hypernym discovery via per-word non-negative vector embeddings which preserve the inclusion property of word contexts.
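The "inclusion property" this summary mentions can be pictured concretely: if "animal" is a hypernym of "dog", the contexts of "dog" are roughly a subset of the contexts of "animal", so a non-negative embedding of "animal" should dominate that of "dog" dimension by dimension. The vectors and scoring function below are invented for illustration and are not DIVE's actual formulation.

```python
# Hedged sketch of the inclusion idea behind non-negative embeddings:
# score how much of the hyponym's embedding mass the hypernym covers.

def inclusion_score(hypo, hyper):
    """Fraction of the hyponym's mass covered by the hypernym (in [0, 1])."""
    covered = sum(min(a, b) for a, b in zip(hypo, hyper))
    return covered / sum(hypo)

dog    = [0.9, 0.0, 0.4, 0.0]   # non-negative, topic-like dimensions
animal = [1.0, 0.6, 0.5, 0.2]

print(inclusion_score(dog, animal))   # high: "animal" plausibly subsumes "dog"
print(inclusion_score(animal, dog))   # lower: the reverse does not hold
```

An asymmetric score like this is what lets a single pair of vectors answer a directional question (is X a kind of Y?) that cosine similarity cannot.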
AutoSense Model for Word Sense Induction
- Computer Science, AAAI
- 2019
This work proposes AutoSense, a latent variable model based on two observations: senses are represented as a distribution over topics, and senses generate pairings between the target word and its neighboring words. Experiments show that AutoSense clearly outperforms competing models.
MSC+: Language pattern learning for word sense induction and disambiguation
- Computer Science, Knowl. Based Syst.
- 2020
Graph Based Word Sense Disambiguation
- Computer Science
- 2017
Traditional PageRank algorithms and random-walk approaches are compared extensively, and knowledge-based approaches are shown to be becoming more popular than other approaches for word sense disambiguation.
CKG: Dynamic Representation Based on Context and Knowledge Graph
- Computer Science, 2020 25th International Conference on Pattern Recognition (ICPR)
- 2021
This paper argues that entities in knowledge graphs (KGs) can reinforce the correct semantic meaning of sentences and proposes CKG: Dynamic Representation Based on Context and Knowledge Graph, a method that extracts rich semantic information from large corpora and makes full use of internal signals such as co-occurrence.
Word sense disambiguation using hybrid swarm intelligence approach
- Computer Science, PLoS ONE
- 2018
A hybrid meta-heuristic method is proposed that combines particle swarm optimization (PSO) and simulated annealing to find the globally best meaning of a given text; it achieves superior performance compared with state-of-the-art approaches.
Computational Methods for the Delamination of Graphite Using Anionic Surfactants to Produce Graphene
- Chemistry, Revista de Investigaciones Universidad del Quindío
- 2019
Today, using thermal and chemical reduction and solubility, graphene oxide is produced on a large scale. Since there are various methods for producing graphene, each of which allocates properties to…
Generating sense inventories for ambiguous arabic words
- Computer Science, Int. Arab J. Inf. Technol.
- 2021
The experiment of replacing ambiguous words with their sense vectors is tested for sentence similarity using all sense inventories, and the results show that the AraVec-Twitter sense inventory provides a better correlation value.
References
Showing 1-10 of 36 references
Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection
- Computer Science, NAACL
- 2018
Distributional inclusion vector embedding (DIVE) is introduced, a simple-to-implement unsupervised method of hypernym discovery via per-word non-negative vector embeddings which preserve the inclusion property of word contexts.
Linear Algebraic Structure of Word Senses, with Applications to Polysemy
- Computer Science, TACL
- 2018
It is shown that multiple word senses reside in linear superposition within the word embedding and simple sparse coding can recover vectors that approximately capture the senses.
Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space
- Computer Science, EMNLP
- 2014
An extension to the Skip-gram model that efficiently learns multiple embeddings per word type is presented, and its scalability is demonstrated by training with one machine on a corpus of nearly 1 billion tokens in less than 6 hours.
Making Sense of Word Embeddings
- Computer Science, Rep4NLP@ACL
- 2016
This work presents a simple yet effective approach that can induce a sense inventory from existing word embeddings via clustering of ego-networks of related words; an integrated WSD mechanism enables labeling of words in context with the learned sense vectors.
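The ego-network clustering described here can be sketched with a simple deterministic label-propagation pass, in the spirit of graph clustering algorithms such as Chinese Whispers; the toy graph of "bank" neighbors below is invented for illustration and is not taken from the paper.

```python
# Toy ego-network sense clustering: neighbors of an ambiguous word form a
# graph, and label propagation partitions them into sense clusters.

def label_propagation(edges, iterations=5):
    """Each node repeatedly adopts the most common label among its neighbors."""
    nodes = sorted({n for e in edges for n in e})
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    labels = {n: n for n in nodes}           # start: every node is its own sense
    for _ in range(iterations):
        for n in nodes:                      # deterministic update order
            counts = {}
            for m in adj[n]:
                counts[labels[m]] = counts.get(labels[m], 0) + 1
            labels[n] = max(sorted(counts), key=counts.get)
    clusters = {}
    for n, lab in labels.items():
        clusters.setdefault(lab, set()).add(n)
    return sorted(map(sorted, clusters.values()))

# Ego network of "bank": a financial cluster and a river cluster.
edges = [("money", "loan"), ("loan", "deposit"), ("money", "deposit"),
         ("river", "shore"), ("shore", "water"), ("river", "water")]
print(label_propagation(edges))
# → [['deposit', 'loan', 'money'], ['river', 'shore', 'water']]
```

Each resulting cluster of neighbors then stands in for one sense of the target word, and averaging its members' vectors yields the sense vectors the abstract mentions.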
Do Multi-Sense Embeddings Improve Natural Language Understanding?
- Computer Science, EMNLP
- 2015
A multi-sense embedding model based on Chinese Restaurant Processes is introduced that achieves state-of-the-art performance on matching human word-similarity judgments, and a pipelined architecture for incorporating multi-sense embeddings into language understanding is proposed.
Word Sense Induction for Novel Sense Detection
- Computer Science, EACL
- 2012
This work applies topic modelling to automatically induce word senses of a target word, and demonstrates that the proposed model can be used to automatically detect words with emergent novel senses, as well as token occurrences of those senses.
Two graph-based algorithms for state-of-the-art WSD
- Computer Science, EMNLP
- 2006
The results show that, in spite of the information loss inherent to mapping the induced senses to the gold-standard, the optimization of parameters based on a small sample of nouns carries over to all nouns, performing close to supervised systems in the lexical sample task and yielding the second-best WSD systems for the Senseval-3 all-words task.
Inducing Word Senses to Improve Web Search Result Clustering
- Computer Science, EMNLP
- 2010
This work first acquires the senses of a query by means of a graph-based clustering algorithm that exploits cycles in the co-occurrence graph of the query, then clusters the search results based on their semantic similarity to the induced word senses.
Discovering word senses from text
- Computer Science, KDD '02
- 2002
A clustering algorithm called CBC (Clustering By Committee) is presented that automatically discovers word senses from text; it first discovers a set of tight clusters, called committees, that are well scattered in the similarity space.
Improving Distributional Similarity with Lessons Learned from Word Embeddings
- Computer Science, TACL
- 2015
It is revealed that much of the performance gains of word embeddings are due to certain system design choices and hyperparameter optimizations, rather than the embedding algorithms themselves, and these modifications can be transferred to traditional distributional models, yielding similar gains.