Sentiment analysis in tweets: an assessment study from classical to modern word representation models
@article{Barreto2021SentimentAI, title={Sentiment analysis in tweets: an assessment study from classical to modern word representation models}, author={Sérgio Barreto and Ricardo Moura and Jonnathan Carvalho and A. Paes and Alexandre Plastino}, journal={Data Mining and Knowledge Discovery}, year={2021}, volume={37}, pages={318--380} }
With the exponential growth of social media networks such as Twitter, vast amounts of user-generated data emerge daily. The short texts published on Twitter – the tweets – have attracted significant attention as a rich source of information to guide many decision-making processes. However, their inherent characteristics, such as their informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks, including sentiment analysis. Sentiment classification is tackled…
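The study contrasts classical and modern word representations for this task. As a rough, self-contained illustration of the classical end of that spectrum, a bag-of-words tweet-polarity baseline could look like the sketch below; the toy tweets, labels, and scikit-learn pipeline are illustrative assumptions, not the paper's experimental setup.

```python
# Hypothetical bag-of-words sentiment baseline (not the paper's pipeline).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy tweets and polarity labels, made up for demonstration (1 = positive, 0 = negative).
tweets = [
    "love this new phone, works great",
    "worst service ever, never again",
    "such a beautiful day :)",
    "totally disappointed with the update",
]
labels = [1, 0, 1, 0]

# TF-IDF over word unigrams and bigrams feeding a linear classifier.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LogisticRegression(max_iter=1000),
)
model.fit(tweets, labels)
print(model.predict(["this update is great"]))
```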
3 Citations
A Resource-optimized and Accelerated Sentiment Analysis Method using Serverless Computing
- Computer Science · Procedia Computer Science
- 2022
Semantic relational machine learning model for sentiment analysis using cascade feature selection and heterogeneous classifier ensemble
- Computer Science · PeerJ Comput. Sci.
- 2022
A Semantic Relational Machine Learning (SRML) model is proposed that automatically classifies the sentiment of tweets using a classifier ensemble and optimal features, achieving higher accuracy than established models based on quantum-inspired sentiment representation (QSR), transformer-based methods such as BERT, BERTweet, and RoBERTa, and ensemble techniques.
Enriching datasets for sentiment analysis in tweets with instance selection
- Computer Science · Anais do IX Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2021)
- 2021
Different strategies for selecting instances from a set of labeled source datasets, including similarity metrics and variations in the number of selected instances, are proposed in order to improve the performance of classifiers trained only on the target dataset.
References
Showing 1–10 of 59 references
BERTweet: A pre-trained language model for English Tweets
- Computer Science · EMNLP
- 2020
BERTweet is presented, the first public large-scale pre-trained language model for English Tweets, trained using the RoBERTa pre-training procedure and producing better performance than previous state-of-the-art models on three Tweet NLP tasks.
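As a minimal sketch of how BERTweet can be used to embed tweets, assuming the Hugging Face `transformers` library and the public `vinai/bertweet-base` checkpoint (this is not the evaluation pipeline of the paper above):

```python
# Minimal sketch: BERTweet as a tweet-level feature extractor (assumed setup).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base", use_fast=False)
model = AutoModel.from_pretrained("vinai/bertweet-base")

tweet = "SC has first two presumptive cases of coronavirus , DHEC confirms"
inputs = tokenizer(tweet, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool the token embeddings into a single tweet vector (hidden size 768).
tweet_embedding = outputs.last_hidden_state.mean(dim=1)
print(tweet_embedding.shape)
```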
RoBERTa: A Robustly Optimized BERT Pretraining Approach
- Computer Science · ArXiv
- 2019
It is found that BERT was significantly undertrained and can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE, and SQuAD.
Deep Contextualized Word Representations
- Computer Science · NAACL
- 2018
A new type of deep contextualized word representation is introduced that models both complex characteristics of word use and how these uses vary across linguistic contexts, allowing downstream models to mix different types of semi-supervision signals.
Advances in Pre-Training Distributed Word Representations
- Computer Science · LREC
- 2018
This paper shows how to train high-quality word vector representations by using a combination of known tricks that are, however, rarely used together, outperforming the current state of the art by a large margin on a number of tasks.
Distributed Representations of Words and Phrases and their Compositionality
- Computer Science · NIPS
- 2013
This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
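For context, the negative-sampling objective introduced in that paper replaces the full softmax: for an input word $w_I$ and an observed context word $w_O$, it maximises

```latex
\log \sigma\!\left({v'_{w_O}}^{\top} v_{w_I}\right)
  + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)}\!\left[\log \sigma\!\left(-{v'_{w_i}}^{\top} v_{w_I}\right)\right]
```

where $\sigma$ is the logistic sigmoid, $v$ and $v'$ are the input and output word vectors, $k$ is the number of negative samples, and $P_n(w)$ is the noise distribution.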
From Frequency to Meaning: Vector Space Models of Semantics
- Computer Science · J. Artif. Intell. Res.
- 2010
The goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs, and to provide pointers into the literature for those who are less familiar with the field.
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
- Computer Science · ICLR
- 2020
This work presents two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT, and uses a self-supervised loss that focuses on modeling inter-sentence coherence.
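A quick way to see the effect of those parameter-reduction techniques is to compare parameter counts; the sketch below assumes the Hugging Face `transformers` library and the public `bert-base-uncased` and `albert-base-v2` checkpoints, and is only an illustration of the size gap, not a speed or quality comparison.

```python
# Illustrative comparison of model sizes (assumed checkpoints, not from the paper).
from transformers import AutoModel

for name in ["bert-base-uncased", "albert-base-v2"]:
    model = AutoModel.from_pretrained(name)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.1f}M parameters")

# Expected order of magnitude: roughly 110M parameters for BERT-base versus
# roughly 12M for ALBERT-base, due to factorized embeddings and cross-layer sharing.
```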
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Computer Science · NAACL
- 2019
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
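The "one additional output layer" recipe mentioned in that summary is what most sentiment-classification fine-tuning looks like in practice; a minimal sketch, assuming the Hugging Face `transformers` library, a two-label polarity setup, and a toy batch (all illustrative assumptions):

```python
# Minimal BERT fine-tuning sketch: pre-trained encoder plus a new classification head.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # the "additional output layer"
)

batch = tokenizer(["great movie", "terrible plot"], padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])  # toy polarity labels

# One fine-tuning step: encoder and output layer are updated jointly.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
print(float(loss))
```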
Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training
- Computer Science · WASSA@EMNLP
- 2018
Emo2Vec is proposed, which encodes emotional semantics into vectors and outperforms existing affect-related representations, such as Sentiment-Specific Word Embedding and DeepMoji embeddings, with much smaller training corpora.
Learning Emotion-enriched Word Representations
- Computer Science, Psychology · COLING
- 2018
This work proposes a novel method of obtaining emotion-enriched word representations, which projects emotionally similar words into neighboring spaces and emotionally dissimilar ones far apart, and demonstrates that the proposed representations outperform several competitive general-purpose and affective word representations.