Multi-sense embeddings through a word sense disambiguation process

@article{Ruas2019MultisenseET,
  title={Multi-sense embeddings through a word sense disambiguation process},
  author={Terry Ruas and W. Grosky and Akiko Aizawa},
  journal={ArXiv},
  year={2019},
  volume={abs/2101.08700}
}
Abstract: Natural Language Understanding has seen an increasing number of publications in the last few years, especially after robust word embedding models became prominent, proving themselves able to capture and represent semantic relationships from massive amounts of data. Nevertheless, traditional models often fall short on intrinsic linguistic issues such as polysemy and homonymy. Any expert system that makes use of natural language at its core can be affected by a weak…
EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses
TLDR: The rich semantic structures in WordNet are leveraged using a graph-theoretic walk technique over word senses to enhance the quality of multi-sense embeddings and to derive new distributional semantic similarity measures for M-SE from prior ones.
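The walk procedure itself is not spelled out in this summary; as a rough sketch of the general idea, the snippet below performs random walks over WordNet's sense graph (via NLTK) to emit pseudo-sentences of synset identifiers that a standard embedding trainer could consume. Every name and parameter here is illustrative, not taken from the paper.

```python
# Illustrative only: random walks over WordNet's sense graph, producing
# "sentences" of synset names that a model such as word2vec could be
# trained on. This is not the EDS-MEMBED algorithm itself.
import random
from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')

def neighbors(synset):
    """Adjacent senses via a few common WordNet relations."""
    return (synset.hypernyms() + synset.hyponyms()
            + synset.also_sees() + synset.similar_tos())

def random_walk(start, length=10):
    """One walk over the sense graph, recorded as synset names."""
    path, current = [start.name()], start
    for _ in range(length - 1):
        nbrs = neighbors(current)
        if not nbrs:
            break
        current = random.choice(nbrs)
        path.append(current.name())
    return path

# A small pseudo-corpus of sense sequences (sampled for speed).
corpus = [random_walk(s) for s in wn.all_synsets('n') if random.random() < 0.001]
print(corpus[0])  # e.g. ['entity.n.01', 'physical_entity.n.01', ...]
```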
Enhanced word embeddings using multi-semantic representation through lexical chains
TLDR: This work proposes two novel algorithms, called Flexible Lexical Chain II and Fixed Lexical Chain II, that combine the semantic relations derived from lexical chains, prior knowledge from lexical databases, and the robustness of the distributional hypothesis in word embeddings as the building blocks of a single system.
Context expansion approach for graph-based word sense disambiguation
TLDR: The proposed method can capture a higher degree of semantic information than existing approaches, thereby increasing semantic connectivity through a graph's edges, and demonstrates that the overall sentiment orientation of a given textual context can be determined.
TensSent: a tensor based sentimental word embedding method
TLDR: This study proposes two novel unsupervised models that integrate word polarity information and word co-occurrences as more tailored features for sentiment analysis.
Math-word embedding in math search and semantic extraction
TLDR: This paper explores math embedding by testing it in several different scenarios and shows that it holds much promise for similarity, analogy, and search tasks; however, the need for more robust math embedding approaches is also observed.
Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles
TLDR: This paper models the problem of finding the relationship between two documents as a pairwise document classification task; the results suggest that classifying semantic relations between documents is a solvable task and motivate the development of a recommender system based on the evaluated techniques.
Identifying Machine-Paraphrased Plagiarism
TLDR: The effectiveness of five pre-trained word embedding models combined with machine learning classifiers and state-of-the-art neural language models is evaluated, and the automated classification is shown to alleviate shortcomings of widely used text-matching systems, such as Turnitin and PlagScan.
Why Machines Cannot Learn Mathematics, Yet
TLDR: This work applies popular text embedding techniques to the arXiv collection of STEM documents, examines why these techniques fail to properly understand mathematics in that corpus, and investigates the missing aspects that would allow mathematics to be learned by computers.
FSPRM: A Feature Subsequence Based Probability Representation Model for Chinese Word Embedding
TLDR: A Feature Subsequence based Probability Representation Model (FSPRM) is proposed for learning Chinese word embeddings, in which the morphological and phonetic features of Chinese characters are integrated and their relevance is captured by designing a feature subsequence.
Contextual Document Similarity for Content-based Literature Recommender Systems
To cope with the ever-growing information overload, an increasing number of digital libraries employ content-based recommender systems. These systems traditionally recommend related documents with…

References

Showing 1–10 of 94 references
sense2vec - A Fast and Accurate Method for Word Sense Disambiguation In Neural Word Embeddings
TLDR: A novel approach is presented which provides a fast and accurate way for a consuming NLP model to select a sense-disambiguated embedding; it can disambiguate both contrastive senses, such as nominal and verbal senses, and nuanced senses such as sarcasm.
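sense2vec's core idea is to train an ordinary word2vec model on tokens annotated with their part-of-speech (or NER) labels, so each annotated form receives its own vector; disambiguation then reduces to picking the tagged form closest to the context. A minimal sketch of that idea with gensim follows; the toy corpus and tags are stand-ins, not the paper's setup.

```python
# Minimal sense2vec-style sketch: appending a POS tag to each token lets
# "duck|NOUN" and "duck|VERB" learn separate vectors; we then pick the
# tagged form most similar to the averaged context vector.
import numpy as np
from gensim.models import Word2Vec

# Toy annotated corpus (a real setup would POS-tag a large corpus).
sentences = [
    ["the|DET", "duck|NOUN", "swam|VERB", "away|ADV"],
    ["they|PRON", "duck|VERB", "under|ADP", "the|DET", "branch|NOUN"],
] * 200  # repeated so the toy model has something to fit

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=20)

def pick_sense(context_tokens, candidates):
    """Return the candidate tagged form most similar to the context average."""
    ctx = np.mean([model.wv[t] for t in context_tokens if t in model.wv], axis=0)
    def cos(c):
        v = model.wv[c]
        return np.dot(v, ctx) / (np.linalg.norm(v) * np.linalg.norm(ctx))
    return max((c for c in candidates if c in model.wv), key=cos)

print(pick_sense(["swam|VERB", "away|ADV"], ["duck|NOUN", "duck|VERB"]))
```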
Embeddings for Word Sense Disambiguation: An Evaluation Study
TLDR: This work proposes different methods through which word embeddings can be leveraged in a state-of-the-art supervised WSD system architecture, and performs a deep analysis of how different parameters affect performance.
Do Multi-Sense Embeddings Improve Natural Language Understanding?
TLDR: A multi-sense embedding model based on Chinese Restaurant Processes is introduced that achieves state-of-the-art performance on matching human word similarity judgments, and a pipelined architecture for incorporating multi-sense embeddings into language understanding is proposed.
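In a Chinese Restaurant Process formulation, each occurrence of a word either joins an existing sense cluster with probability proportional to how well it fits, or opens a new sense with probability governed by a concentration parameter. The toy sketch below illustrates that assignment step only; the vectors, scoring, and `gamma` value are assumptions, not the paper's model.

```python
# Toy CRP-style sense assignment: a context joins an existing sense with
# probability ~ (cluster size * fit), or opens a new sense with weight gamma.
import numpy as np

def crp_assign(context_vec, sense_means, sense_counts, gamma=1.0):
    """Return the index of the chosen sense; len(sense_means) means 'new sense'."""
    scores = [count * max(np.dot(mean, context_vec), 1e-6)
              for mean, count in zip(sense_means, sense_counts)]
    scores.append(gamma)  # weight of opening a brand-new sense
    probs = np.array(scores) / np.sum(scores)
    return int(np.random.choice(len(scores), p=probs))
```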
Embedding Words and Senses Together via Joint Knowledge-Enhanced Training
TLDR: This work proposes a new model which learns word and sense embeddings jointly, exploiting large corpora and knowledge from semantic networks in order to produce a unified vector space of words and senses.
SensEmbed: Learning Sense Embeddings for Word and Relational Similarity
TLDR: This work proposes a multifaceted approach that transforms word embeddings to the sense level and leverages knowledge from a large semantic network for effective semantic similarity measurement.
Improving Distributed Representation of Word Sense via WordNet Gloss Composition and Context Clustering
TLDR: The learned representations outperform the publicly available embeddings on 2 out of 4 metrics in the word similarity task and on 6 out of 13 subtasks in the analogical reasoning task.
Simple Embedding-Based Word Sense Disambiguation
TLDR: A knowledge-based WSD method that uses word and sense embeddings to compute the similarity between the gloss of a sense and the context of the word; the results show that lexically extending the number of words in the gloss and context, although it works well for other implementations of Lesk, harms this method.
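The gloss-context matching described here is straightforward to prototype: embed each candidate sense's gloss and the target word's context as averaged word vectors, then pick the sense whose gloss is most similar to the context. In the sketch below, `vectors` is an assumed dict-like mapping from word to NumPy array (e.g., loaded GloVe vectors); it is a placeholder, not part of the paper.

```python
# Sketch of embedding-based gloss/context matching (a vector-space Lesk).
import numpy as np
from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')

def avg_embedding(tokens, vectors):
    vecs = [vectors[t] for t in tokens if t in vectors]
    return np.mean(vecs, axis=0) if vecs else None

def disambiguate(word, context_tokens, vectors):
    """Pick the WordNet sense whose gloss best matches the context."""
    ctx = avg_embedding(context_tokens, vectors)
    best, best_sim = None, -1.0
    for sense in wn.synsets(word):
        gloss = avg_embedding(sense.definition().lower().split(), vectors)
        if ctx is None or gloss is None:
            continue
        sim = np.dot(ctx, gloss) / (np.linalg.norm(ctx) * np.linalg.norm(gloss))
        if sim > best_sim:
            best, best_sim = sense, sim
    return best

# Usage (vectors is your pretrained word -> np.ndarray mapping):
# disambiguate("bank", ["river", "water", "shore"], vectors)
```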
Entity Linking meets Word Sense Disambiguation: a Unified Approach
TLDR: Babelfy is presented, a unified graph-based approach to EL and WSD based on a loose identification of candidate meanings coupled with a densest-subgraph heuristic which selects high-coherence semantic interpretations.
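A densest-subgraph heuristic of this kind can be approximated greedily: repeatedly drop the weakest candidate meaning, as long as its mention still has at least one other candidate, and keep the densest intermediate graph. The simplified networkx sketch below shows only that selection step; how the candidate graph is built and weighted (Babelfy uses semantic signatures) is out of scope here.

```python
# Greedy densest-subgraph selection over candidate meanings, in the
# spirit of Babelfy's heuristic (heavily simplified).
import networkx as nx

def densest_candidates(G, mention_of):
    """G: graph whose nodes are candidate meanings; mention_of maps each
    candidate to its text mention. Greedily remove low-degree candidates,
    never emptying a mention, and return the densest subgraph seen."""
    G = G.copy()
    best, best_density = G.copy(), nx.density(G)
    while G.number_of_nodes() > 1:
        # Only candidates whose mention keeps at least one alternative.
        removable = [n for n in G
                     if sum(mention_of[m] == mention_of[n] for m in G) > 1]
        if not removable:
            break
        G.remove_node(min(removable, key=G.degree))
        if nx.density(G) > best_density:
            best, best_density = G.copy(), nx.density(G)
    return best
```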
Nasari: Integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities
TLDR: A novel multilingual vector representation, called Nasari, is put forward, which not only enables accurate representation of word senses in different languages but also provides two main advantages over existing approaches: high coverage, and comparability across languages and linguistic levels.
De-Conflated Semantic Representations
TLDR: This work proposes a technique that tackles semantic representation problems by de-conflating the representations of words based on deep knowledge derived from a semantic network; its advantages include high coverage and the ability to generate accurate representations even for infrequent word senses.