Democrats, republicans and starbucks afficionados: user classification in twitter

Marco Pennacchiotti and Ana Maria Popescu
More and more technologies (Web search, content recommendation services, marketing, ad targeting, etc.) are taking advantage of the explosion of social media. This paper focuses on the problem of automatically constructing user profiles, which can significantly benefit such technologies. We describe a general and robust machine learning framework for large-scale classification of social media users according to dimensions of interest. We report encouraging experimental results on 3 tasks with…

Identification of extremism on Twitter
  • Yifang Wei, L. Singh, S. Martin
  • Computer Science
    2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)
  • 2016
This paper explores different Twitter metrics as proxies for misbehavior, including the sentiment of a user's published tweets, the polarity of the user's ego-network, and user mentions, and finds that combining all these features leads to the highest accuracy for detecting extremism on Twitter.
Religious Politicians and Creative Photographers: Automatic User Categorization in Twitter
This work sets out to automatically infer professions and personality-related attributes for Twitter users based on features extracted from their content, their interaction networks, attributes of their friends and their activity patterns.
#greysanatomy vs. #yankees: Demographics and Hashtag Use on Twitter
This work uses state-of-the-art face analysis software to infer gender, age, and race from profile images of 350K Twitter users from New York for the period from November 1, 2014 to October 31, 2015.
Predicting the Topical Stance and Political Leaning of Media using Tweets
A cascaded method that uses unsupervised learning to ascertain the stance of Twitter users with respect to a polarizing topic by leveraging their retweet behavior; then, it uses supervised learning based on user labels to characterize both the general political leaning of online media and of popular Twitter users.
What's in Your Tweets? I Know Who You Supported in the UK 2010 General Election
The experimental results showed that the best-performing classification method, which uses the number of Twitter messages referring to a particular political party, achieved about 86% classification accuracy without any training phase.
Steeler nation, 12th man, and boo birds: Classifying Twitter user interests using time series
  • Tao Yang, Dongwon Lee, Su Yan
  • Computer Science
    2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2013)
  • 2013
This work generates time series from tweets by exploiting the latent temporal information and solves the classification problem in the time series domain, applying binary and multi-class approaches to the problem of Twitter user classification using the contents of tweets.
Using Twitter to Predict Voting Behavior
Twitter provides an excellent platform for solving political classification problems, and if these advantages can be leveraged to predict how Twitter users will vote, this information could be used to predict an election outcome or the likely impact of real-time events on voting patterns.
Efficient User Profiling in Twitter Social Network Using Traditional Classifiers
An efficient supervised machine learning approach which categorizes Twitter users based on three important features into six interest categories, and proposes a design for a real-time system for Twitter user profiling along with a prototype implementation.
Social Analysis of Young Basque Speaking Communities in Twitter
The main objective is to combine demographic inference and social analysis in order to detect young Basque Twitter users and to identify the communities that arise from their relationships or shared content.
Member Classification and Party Characteristics in Twitter during UK Election
In modern politics, parties and individual candidates must have an online presence and usually have dedicated social media coordinators. In this context, real time member classification and party


Characterizing Microblogs with Topic Models
A scalable implementation of a partially supervised learning model (Labeled LDA) that maps the content of the Twitter feed into dimensions that correspond roughly to substance, style, status, and social characteristics of posts is presented.
Why we twitter: understanding microblogging usage and communities
It is found that people use microblogging to talk about their daily activities and to seek or share information; the user intentions associated at a community level are analyzed to show how users with similar intentions connect with each other.
Classifying latent user attributes in twitter
A novel investigation of stacked-SVM-based classification algorithms over a rich set of original features, applied to classifying these four user attributes, as distinct from the other primarily spoken genres previously studied in the user-property classification literature.
Detecting Spammers on Twitter
This paper uses tweets related to three famous trending topics from 2009 to construct a large labeled collection of users, manually classified into spammers and non-spammers, and identifies a number of characteristics related to tweet content and user social behavior which could potentially be used to detect spammers.
Robust Sentiment Detection on Twitter from Biased and Noisy Data
In this paper, we propose an approach to automatically detect sentiments on Twitter messages (tweets) that explores some characteristics of how tweets are written and meta-information of the words
Unsupervised Modeling of Twitter Conversations
This work proposes the first unsupervised approach to the problem of modeling dialogue acts in an open domain, trained on a corpus of noisy Twitter conversations, and addresses the challenge of evaluating the emergent model with a qualitative visualization and an intrinsic conversation ordering task.
Crystal: Analyzing Predictive Opinions on the Web
An election prediction system based on web users’ opinions posted on an election prediction website, which significantly outperforms several baselines as well as a non-generalized n-gram approach and proposes a novel technique which generalizes n-gram feature patterns.
Inferring gender of movie reviewers: exploiting writing style, content and metadata
It is found that the perceived utility of a review is an important correlate of gender, and content and stylistic features can be augmented with metadata to compensate for the brevity of reviews.
Recognizing Stances in Ideological On-Line Debates
This work constructs an arguing lexicon automatically from a manually annotated corpus and builds supervised systems employing sentiment and arguing opinions and their targets as features, which perform substantially better than a distribution-based baseline.
The demographics of web search
The research combines three data sources: the query log of a major US-based web search engine, profile information provided by 28 million of its users, and US-census information including detailed demographic information aggregated at the level of ZIP code, which creates a powerful user modeling tool.