Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification

@article{Chen2019EmojiPoweredRL,
  title={Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification},
  author={Zhenpeng Chen and Sheng Shen and Ziniu Hu and Xuan Lu and Qiaozhu Mei and Xuanzhe Liu},
  journal={The World Wide Web Conference},
  year={2019}
}
Sentiment classification typically relies on a large amount of labeled data. In practice, the availability of labels is highly imbalanced among different languages, e.g., more English texts are labeled than texts in any other language, which creates a considerable inequality in the quality of related information services received by users speaking different languages. To tackle this problem, cross-lingual sentiment classification approaches aim to transfer knowledge learned from one language…

Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification (Extended Abstract)
TLDR
A novel representation learning method that uses emoji prediction as an instrument to learn sentiment-aware representations for each language, which are then integrated to facilitate cross-lingual sentiment classification.
SEntiMoji: an emoji-powered learning approach for sentiment analysis in software engineering
TLDR
Emotional emojis are employed as noisy labels of sentiments, and a representation learning approach is proposed that uses both Tweets and GitHub posts containing emojis to learn sentiment-aware representations for SE-related texts, achieving significant improvement on representative benchmark datasets.
Zero-Shot Learning for Cross-Lingual News Sentiment Classification
TLDR
The proposed approach achieves state-of-the-art performance on the sentiment analysis task on Slovenian news and largely outperforms the majority classifier, as well as all settings without sentiment enrichment in pre-training.
English-Malay Word Embeddings Alignment for Cross-lingual Emotion Classification with Hierarchical Attention Network
TLDR
This paper bridges the language gap between English and Malay through cross-lingual word embeddings constructed using singular value decomposition, with a hierarchical attention model pre-trained on English tweets and fine-tuned on a set of gold-standard Malay tweets.
A Review on Multi-Lingual Sentiment Analysis by Machine Learning Methods
TLDR
This paper attempts to provide a detailed study of the sentiment analysis methods applied to languages other than English, covering methods that analyze translated data as well as methods that analyze available data in the target language.
Multilingual emoji prediction using BERT for sentiment analysis
TLDR
A new model that learns from sentences using emojis as labels is proposed, with English and Japanese tweets collected from Twitter as the corpus. A bidirectional transformer is found suitable for emoji prediction, and the first comparison of emoji prediction between Japanese and English is made.
Bridging the domain gap in cross-lingual document classification
TLDR
It is shown that addressing the domain gap is crucial in XLU, and state-of-the-art cross-lingual methods are combined with recently proposed methods for weakly supervised learning, such as unsupervised pre-training and unsupervised data augmentation, to simultaneously close both the language gap and the domain gap.
Emoji-powered Sentiment and Emotion Detection from Software Developers’ Communication Data
TLDR
This article leverages Tweets and GitHub posts containing emojis to learn representations of SE-related texts through emoji prediction and leverages the sentiment-aware representations as well as manually labeled data to learn the final sentiment/emotion classifier via transfer learning.
A survey of sentiment analysis in the Portuguese language
  • D. Pereira
  • Computer Science
    Artificial Intelligence Review
  • 2020
TLDR
This paper categorizes and describes state-of-the-art works involving approaches to each of the tasks of sentiment analysis, as well as supporting language resources such as natural language processing tools, lexicons, corpora, ontologies, and datasets.
Analyzing the Sensitivity of Deep Neural Networks for Sentiment Analysis: A Scoring Approach
TLDR
A scoring function is applied to rank word importance without depending on the parameters or structure of the deep neural model, in order to identify the model’s weakness and perturb words to craft targeted attacks that exploit this weakness.

References

Attention-based LSTM Network for Cross-Lingual Sentiment Classification
TLDR
An attention-based bilingual representation learning model which learns the distributed semantics of the documents in both the source and the target languages and proposes a hierarchical attention mechanism for the bilingual LSTM network.
Learning Bilingual Sentiment Word Embeddings for Cross-language Sentiment Classification
TLDR
The proposed BSWE incorporate sentiment information of text into bilingual embeddings, and high-quality BSWE can be learned by simply employing labeled corpora and their translations, without relying on large-scale parallel corpora.
Cross-Lingual Sentiment Classification with Bilingual Document Representation Learning
TLDR
This study proposes a representation learning approach which simultaneously learns vector representations for the texts in both the source and the target languages and shows that BiDRL outperforms the state-of-the-art methods for all the target languages.
Modeling Language Discrepancy for Cross-Lingual Sentiment Analysis
TLDR
This paper aims to model the language discrepancy in sentiment expressions as intrinsic bilingual polarity correlations (IBPCs) for better cross-lingual sentiment analysis and demonstrates the superiority of the proposed models against several state-of-the-art alternatives.
Leveraging Large Amounts of Weakly Supervised Data for Multi-Language Sentiment Classification
TLDR
This paper presents a novel approach for multi-lingual sentiment classification in short texts by leveraging large amounts of weakly-supervised data in various languages to train a multi-layer convolutional network and demonstrates the importance of using pre-training of such networks.
Emoticon Smoothed Language Models for Twitter Sentiment Analysis
TLDR
A novel model, called emoticon smoothed language model (ESLAM), is presented, which trains a language model on the manually labeled data and then uses the noisy emoticon data for smoothing.
Linguistically Regularized LSTM for Sentiment Classification
TLDR
Results show that the proposed simple models are able to capture the linguistic role of sentiment words, negation words, and intensity words in sentiment expression.
Semi-Supervised Representation Learning for Cross-Lingual Text Classification
TLDR
This paper proposes a new cross-lingual adaptation approach for document classification based on learning cross-lingual discriminative distributed representations of words to maximize the log-likelihood of the documents from both language domains under a cross-lingual log-bilinear document model, while minimizing the prediction log-losses of labeled documents.
How Translation Alters Sentiment
TLDR
A state-of-the-art Arabic sentiment analysis system, a new dialectal Arabic sentiment lexicon, and the first Arabic-English parallel corpus that is independently annotated for sentiment by Arabic and English speakers are created.
Sentence-level Sentiment Classification with Weak Supervision
TLDR
The contextual information of sentences and words extracted from unlabeled sentences is incorporated into the approach to enhance the learning of the sentiment classifier, and experiments show that the approach can effectively improve the performance of sentence-level sentiment classification.