Emoticon Smoothed Language Models for Twitter Sentiment Analysis

  • Kun Liu, Wu-Jun Li, Minyi Guo
  • Published 22 July 2012
  • Computer Science
  • Proceedings of the AAAI Conference on Artificial Intelligence
Twitter sentiment analysis (TSA) has become a hot research topic in recent years. The goal of this task is to discover the attitude or opinion expressed in tweets, which is typically formulated as a machine-learning-based text classification problem. Some methods use manually labeled data to train fully supervised models, while others use noisy labels, such as emoticons and hashtags, for model training. In general, we can only get a limited amount of training data for the fully supervised… 
TweetGrep: Weakly Supervised Joint Retrieval and Sentiment Analysis of Topical Tweets
Experiments show that TweetGrep beats the state-of-the-art models for both the tasks of retrieving topical tweets and analyzing the sentiment of the tweets (average improvement of 4.97% and 6.91% respectively in terms of area under the curve).
TASC: Topic-Adaptive Sentiment Classification on Dynamic Tweets
A semi-supervised topic-adaptive sentiment classification (TASC) model is proposed, which starts with a classifier built on common features and mixed labeled data from various topics and beats other well-known supervised and ensemble classifiers without feature adaptation.
Acquiring and Exploiting Lexical Knowledge for Twitter Sentiment Analysis
This thesis addresses the label sparsity problem for Twitter polarity classification by automatically building two types of resources that can be exploited when labelled data is scarce: opinion lexicons, which are lists of words labelled by sentiment, and synthetically labelled tweets.
Sentiment Analysis on Twitter through Topic-Based Lexicon Expansion
This approach provides a way to do domain-dependent sentiment analysis without the cost of data annotation and leads to statistically significant improvements in classification accuracies across 56 topics with a state-of-the-art lexicon-based classifier.
Meta-level sentiment models for big social data analysis
This paper proposes a novel model for Twitter sentiment classification, explores binary classification in which the data set is divided into positive and negative classes, and confirms the superiority of the proposed model over the state-of-the-art systems.
Annotate-Sample-Average (ASA): A New Distant Supervision Approach for Twitter Sentiment Analysis
Annotate-Sample-Average (ASA) is proposed, a distant supervision method that uses large amounts of unlabelled tweets obtained from the Twitter API and prior lexical knowledge in the form of opinion lexicons to generate synthetic training data for Twitter polarity classification.
Distantly Supervised Lifelong Learning for Large-Scale Social Media Sentiment Analysis
The results show that the lifelong sentiment learning approach is feasible and effective for tackling the challenges of continuously updated texts with dynamic topics in social media, and that the belief “the more training data the better performance” does not hold in large-scale social media sentiment analysis.
This paper compares three emoticon pre-processing methods (emoticon deletion, emoticon 2-valued translation, and emoticon explanation), proposes a method based on an emoticon-weight lexicon, and conducts experiments with a Naive Bayes classifier to validate the crucial role emoticons play in guiding the emotional tendency of a tweet.
This paper focuses on how to fuse the textual information of Twitter messages with sentiment analysis patterns to obtain better performance on Twitter data.


Effective sentiment stream analysis with self-augmenting training and demand-driven projection
The heart of the approach is a training augmentation procedure which takes as input a small training seed, and then it automatically incorporates new relevant messages to the training data, so that at any given time the model properly reflects the sentiments in the event being analyzed.
Enhanced Sentiment Learning Using Twitter Hashtags and Smileys
A supervised sentiment classification framework which is based on data from Twitter, a popular microblogging service, is proposed, utilizing 50 Twitter tags and 15 smileys as sentiment labels, allowing identification and classification of diverse sentiment types of short texts.
From bias to opinion: a transfer-learning approach to real-time sentiment analysis
This paper adopted user bias as the basis for building accurate classification models and applied its model to posts collected from Twitter on two topics: the 2010 Brazilian Presidential Elections and the 2010 season of Brazilian Soccer League.
Target-dependent Twitter Sentiment Classification
This paper proposes to improve target-dependent Twitter sentiment classification by incorporating target-dependent features and taking related tweets into consideration; according to the experimental results, this approach greatly improves the performance of target-dependent sentiment classification.
Collective Semantic Role Labeling for Tweets with Clustering
This work proposes a new method to collectively label similar tweets and shows that this approach remarkably improves SRL by 3.1% F1.
User-level sentiment analysis incorporating social networks
It is shown that information about social relationships can be used to improve user-level sentiment analysis, and that incorporating social-network information can lead to statistically significant sentiment classification improvements over an approach based on Support Vector Machines with access only to textual features.
Twitter power: Tweets as electronic word of mouth
It is found that microblogging is an online tool for customer word-of-mouth communications, and the implications for corporations using microblogging as part of their overall marketing strategy are discussed.
Robust Sentiment Detection on Twitter from Biased and Noisy Data
In this paper, we propose an approach to automatically detect sentiments on Twitter messages (tweets) that explores some characteristics of how tweets are written and meta-information of the words
Twitter Sentiment Analysis: The Good the Bad and the OMG!
This paper evaluates the usefulness of existing lexical resources as well as features that capture information about the informal and creative language used in microblogging, and uses existing hashtags in the Twitter data for building training data.
Classifying sentiment in microblogs: is brevity an advantage?
Surprisingly, classifying sentiment in microblogs is found to be easier than in blogs, and a number of observations pertaining to the challenge of supervised learning for sentiment analysis in microblogs are made.