Enhanced word embeddings using multi-semantic representation through lexical chains
@article{Ruas2020EnhancedWE,
  title   = {Enhanced word embeddings using multi-semantic representation through lexical chains},
  author  = {Terry Ruas and Charles Henrique Porto Ferreira and William I. Grosky and Fabr{\'i}cio Olivetti de Fran{\c{c}}a and Debora Maria Rossi de Medeiros},
  journal = {ArXiv},
  year    = {2020},
  volume  = {abs/2101.09023}
}
11 Citations
Math-word embedding in math search and semantic extraction
- Computer Science · Scientometrics
- 2020
This paper explores math embedding by testing it in several different scenarios, showing that it holds much promise for similarity, analogy, and search tasks; however, it also observes the need for more robust math embedding approaches.
Incorporating Word Sense Disambiguation in Neural Language Models
- Computer Science · ArXiv
- 2021
Two supervised (pre-)training methods are presented that incorporate gloss definitions from lexical resources to leverage Word Sense Disambiguation capabilities in neural language models and exceed state-of-the-art techniques on the SemEval and Senseval datasets.
Specialized Document Embeddings for Aspect-based Similarity of Research Papers
- Computer Science · 2022 ACM/IEEE Joint Conference on Digital Libraries (JCDL)
- 2022
The approach of aspect-based document embeddings mitigates potential risks arising from implicit biases by making them explicit and can, for example, be used for more diverse and explainable recommendations.
FSPRM: A Feature Subsequence Based Probability Representation Model for Chinese Word Embedding
- Computer Science · IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2021
A Feature Subsequence based Probability Representation Model (FSPRM) is proposed for learning Chinese word embeddings, in which the morphological and phonetic features of Chinese characters are integrated and their relevance is considered by designing a feature subsequence.
Identifying Machine-Paraphrased Plagiarism
- Computer Science · iConference
- 2022
The detection of machine-paraphrased text is evaluated using pre-trained word embedding models combined with state-of-the-art neural language models, and the automated classification is shown to alleviate shortcomings of widely used text-matching systems such as Turnitin and PlagScan.
Word-Embedding-Based Traffic Document Classification Model for Detecting Emerging Risks Using Sentiment Similarity Weight
- Computer Science · IEEE Access
- 2020
Through word imputation using an established similarity dictionary and by widening the limited utilization range, the proposed method overcomes the disadvantage of sentiment dictionaries and enables the detection of emerging risks.
A multi-dimensional relation model for dimensional sentiment analysis
- Computer Science · Inf. Sci.
- 2021
Fake or not? Automated detection of COVID-19 misinformation and disinformation in social networks and digital media
- Computer Science · Computational and Mathematical Organization Theory
- 2022
This work aggregated several COVID-19 misinformation datasets and compared differences between learning models from individual datasets versus one that was aggregated, and evaluated the impact of using several word- and sentence-embedding models and transformers on the performance of classification models.
Testing the Generalization of Neural Language Models for COVID-19 Misinformation Detection
- Computer Science · iConference
- 2022
It is shown that tokenizers and models tailored to COVID-19 data do not provide a significant advantage over general-purpose ones, and a broad spectrum of datasets and models is evaluated to benefit future research in developing misinformation detection systems.
References
Showing 1–10 of 54 references
Multi-sense embeddings through a word sense disambiguation process
- Computer Science · Expert Syst. Appl.
- 2019
Embedding Words and Senses Together via Joint Knowledge-Enhanced Training
- Computer Science · CoNLL
- 2017
This work proposes a new model which learns word and sense embeddings jointly and exploits large corpora and knowledge from semantic networks in order to produce a unified vector space of word and senses.
Towards Lexical Chains for Knowledge-Graph-based Word Embeddings
- Computer Science · RANLP
- 2017
This work exploits lexical-chain-based templates over a knowledge graph to generate pseudo-corpora with controlled linguistic value, showing that, on the one hand, incorporating many-relation lexical chains improves results, while on the other, unrestricted-length chains remain difficult to handle because of their huge quantity.
Embeddings for Word Sense Disambiguation: An Evaluation Study
- Computer Science · ACL
- 2016
This work proposes different methods through which word embeddings can be leveraged in a state-of-the-art supervised WSD system architecture, and performs a deep analysis of how different parameters affect performance.
Semantic Feature Structure Extraction From Documents Based on Extended Lexical Chains
- Computer Science · GWC
- 2018
This paper explores the degree of cohesion among a document’s words using lexical chains as a semantic representation of its meaning, with WordNet as the lexical database, and develops a text-document representation that can be used for semantic document retrieval.
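The entry above describes building lexical chains, i.e. grouping a document's words by semantic relatedness. A minimal sketch of the greedy chaining idea follows; a real system would query WordNet for synonym/hypernym relations, but here a tiny hand-made relatedness table stands in for the lexical database (the table and word list are illustrative assumptions, not from the paper).

```python
# Toy lexical-chain builder. A hand-made relatedness table stands in for
# WordNet; in practice relatedness would come from synonym/hypernym links.
RELATED = {
    ("car", "vehicle"), ("vehicle", "truck"), ("truck", "car"),
    ("bank", "money"), ("money", "loan"),
}

def related(a, b):
    """Symmetric lookup in the toy relatedness table."""
    return a == b or (a, b) in RELATED or (b, a) in RELATED

def build_chains(tokens):
    """Greedily attach each token to the first chain holding a related word;
    start a new chain when no existing chain matches."""
    chains = []
    for tok in tokens:
        for chain in chains:
            if any(related(tok, w) for w in chain):
                chain.append(tok)
                break
        else:
            chains.append([tok])
    return chains

doc = ["car", "money", "truck", "loan", "vehicle"]
print(build_chains(doc))  # → [['car', 'truck', 'vehicle'], ['money', 'loan']]
```

Each resulting chain groups words that share a cohesive thread through the document, which is the structure such papers then turn into a semantic feature representation.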
Lexical Chains meet Word Embeddings in Document-level Statistical Machine Translation
- Computer Science · DiscoMT@EMNLP
- 2017
This work proposes a method that benefits from the semantic similarity in lexical chains to improve SMT output by integrating it in a document-level decoder, and focuses on word embeddings to deal with the lexical chains, in contrast to the traditional approach that uses lexical resources.
De-Conflated Semantic Representations
- Computer Science · EMNLP
- 2016
This work proposes a technique that tackles semantic representation problems by de-conflating the representations of words based on deep knowledge derived from a semantic network; its advantages include high coverage and the ability to generate accurate representations even for infrequent word senses.
Bag of meta-words: A novel method to represent document for the sentiment classification
- Computer Science · Expert Syst. Appl.
- 2018
Enriching Word Vectors with Subword Information
- Computer Science · TACL
- 2017
A new approach based on the skip-gram model, in which each word is represented as a bag of character n-grams and a word vector is the sum of these n-gram representations, achieving state-of-the-art performance on word similarity and analogy tasks.
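The subword mechanism described above (each word as a bag of character n-grams, summed into one vector) can be sketched in a few lines. The embedding table, dimensions, and hashing scheme below are illustrative assumptions, not the fastText implementation; only the n-gram decomposition with boundary markers follows the paper.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word wrapped in boundary markers '<' and '>'."""
    w = f"<{word}>"
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

# Toy random embedding table keyed by n-gram hash (sizes are assumptions).
DIM, BUCKETS = 8, 1000
rng = np.random.default_rng(0)
table = rng.normal(size=(BUCKETS, DIM))

def word_vector(word):
    """Word vector = sum of the vectors of its character n-grams."""
    return sum(table[hash(g) % BUCKETS] for g in char_ngrams(word))

v = word_vector("where")
print(v.shape)  # (8,)
```

Because a vector exists for any n-gram, this scheme produces representations even for words never seen during training, which is the key advantage the abstract alludes to.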
Deep Contextualized Word Representations
- Computer Science · NAACL
- 2018
A new type of deep contextualized word representation is introduced that models both complex characteristics of word use and how these uses vary across linguistic contexts, allowing downstream models to mix different types of semi-supervision signals.