Classifying sentiment in microblogs: is brevity an advantage?

  • Adam Bermingham, Alan F. Smeaton
  • Published 26 October 2010
  • Computer Science
  • Proceedings of the 19th ACM international conference on Information and knowledge management
Microblogs as a new textual domain offer a unique proposition for sentiment analysis. Their short document length suggests any sentiment they contain is compact and explicit. However, this short length coupled with their noisy nature can pose difficulties for standard machine learning document representations. In this work we examine the hypothesis that it is easier to classify the sentiment in these short form documents than in longer form documents. Surprisingly, we find classifying sentiment… 

Microblog Sentiment Classification with Contextual Knowledge Regularization

This paper proposes to use the microblogs' contextual knowledge mined from a large amount of unlabeled data to help improve microblog sentiment classification, and defines two kinds of contextual knowledge: word-word association and word-sentiment association.

Columbia NLP: Sentiment Detection of Subjective Phrases in Social Media

We present a supervised sentiment detection system that classifies the polarity of subjective phrases as positive, negative, or neutral. It is tailored towards online genres, specifically Twitter,

Polarity Classification for Target Phrases in Tweets: A Word2Vec Approach

Polar classification of tweets refers to the task of assigning a positive or a negative sentiment to an entire tweet, which is predicting the polarity of a specific target phrase, for instance @Microsoft or #Linux, which are contained in the tweet.

Leveraging Social Media Linguistic Features for Bilingual Microblog Sentiment Classification

A lexicon-based sentiment analysis algorithm that uses a unified approach for determining the sentiment of comments written in both languages and incorporates techniques that exploit the distinctive features of the language used in microblogs in order to accurately predict the sentiment expressed in microblog comments.

Open Domain Targeted Sentiment

The intuition behind this work is that sentiment expressed towards an entity, targeted sentiment, may be viewed as a span of sentiment expressed across the entity, and this representation allows us to model sentiment detection as a sequence tagging problem, jointly discovering people and organizations along with whether there is sentiment directed towards them.

CLUSM: An Unsupervised Model for Microblog Sentiment Analysis Incorporating Link Information

This paper is the first to divide the links between microblogs into three classes, and proposes an unsupervised model called Content and Link Unsupervised Sentiment Model (CLUSM), which focuses on microblog sentiment analysis by incorporating the above three types of links.

DeustoTech Internet at TASS 2015: Sentiment Analysis and Polarity Classification in Spanish Tweets

The system addresses task 1 of the TASS 2015 sentiment analysis workshop, which consists of performing automatic sentiment analysis to determine the global polarity of a set of tweets in Spanish, using a supervised model based on linear Support Vector Machines combined with polarity lexicons.

Streaming Analytics

Different methods of text preprocessing are explained and applied with a naive Bayes classifier on a distributed big-data computing platform, with the goal of creating a scalable sentiment analysis solution that can classify text into positive or negative categories.

Performing sentiment analysis in Bangla microblog posts

This paper aims to automatically extract the sentiments or opinions conveyed by users from Bangla microblog posts and then identify the overall polarity of texts as either negative or positive.



A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts

A novel machine-learning method is proposed that applies text-categorization techniques to just the subjective portions of the document, which greatly facilitates incorporation of cross-sentence contextual constraints.

Topic-dependent sentiment analysis of financial blogs

This work develops a corpus of financial blogs, annotated with polarity of sentiment with respect to a number of companies, and proposes text extraction techniques to create topic-specific sub-documents, which are used to train a sentiment classifier.

Sentiment classification on customer feedback data: noisy data, large feature vectors, and the role of linguistic analysis

It is demonstrated that automatic sentiment classification is possible in the very noisy domain of customer feedback data by using large feature vectors in combination with feature reduction, and that adding deep linguistic analysis features to a set of surface-level word n-gram features contributes consistently to classification accuracy.

Sentiment Classification Using Word Sub-sequences and Dependency Sub-trees

Text mining techniques are used to extract frequent word sub-sequences and dependency sub-trees from sentences in a document dataset, which are then used as features of support vector machines for document sentiment classification.

Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis

A new approach to phrase-level sentiment analysis is presented that first determines whether an expression is neutral or polar and then disambiguates the polarity of the polar expressions.

SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining

SENTIWORDNET is a lexical resource in which each WORDNET synset is associated to three numerical scores Obj, Pos and Neg, describing how objective, positive, and negative the terms contained in the synset are.

Characterizing debate performance via aggregated twitter sentiment

An analytical methodology and visual representations are developed that could help a journalist or public affairs person better understand the temporal dynamics of sentiment in reaction to the debate video.

Modeling Public Mood and Emotion: Twitter Sentiment and Socio-Economic Phenomena

It is speculated that large-scale analyses of mood can provide a solid platform to model collective emotive trends in terms of their predictive value with regard to existing social as well as economic indicators.

The TREC Blogs06 Collection: Creating and Analysing a Blog Test Collection

This paper describes the creation of the Blogs06 collection by the University of Glasgow, reports statistics of the collected data, and demonstrates how some characteristics of the collection vary across the spam and non-spam components.

How Much Noise Is Too Much: A Study in Automatic Text Classification

The goal of this paper is to bring out and study the effect of different kinds of noise on automatic text classification, and present interesting results on real-life noisy datasets from various CRM domains.