Utilizing Hashtags for Sentiment Analysis of Tweets in the Political Domain
What people say on social media has become a rich source of information for understanding social behavior. Sentiment analysis of Twitter data has been widely used to capture trends in public opinion regarding important events such as political elections. However, current research on social media analysis in political domains faces two major problems: the sentiment analysis methods used are often too simple, and most works assume that all users and their tweets are trustworthy. This research aims to address these problems in order to achieve more reliable measurements of public opinion. First, a dataset of 513K tweets referring to the 2014 Colombian presidential election was collected. To distinguish spammer accounts from non-spammer ones, a supervised learning technique was applied to a labeled collection of users. Next, a sentiment analysis system was developed following a supervised classification approach. Lastly, the system was applied to the Colombian election to investigate the potential of social media for inferring voting intention. Experimental results show that inference methods based on Twitter data are not consistent, even though the proposed inference method obtained the lowest mean absolute error and correctly ranked the highest-polling candidates in the first-round election.
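To make the evaluation criteria concrete, the following is a minimal sketch of how a mean absolute error and a ranking check could be computed for a voting intention estimate. It is an illustration under stated assumptions, not the inference method of this work: the share estimate here is a plain proportion of per-candidate counts, and the function names, candidate labels, and all numbers are hypothetical placeholders rather than data from the study.

```python
def estimated_shares(counts):
    """Turn per-candidate counts (e.g., of positive tweets) into percentages.

    Assumption: a simple proportional estimate, used only for illustration.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one count must be non-zero")
    return {cand: 100.0 * n / total for cand, n in counts.items()}


def mean_absolute_error(predicted, actual):
    """Average absolute gap, in percentage points, across candidates."""
    if predicted.keys() != actual.keys():
        raise ValueError("both mappings must cover the same candidates")
    return sum(abs(predicted[c] - actual[c]) for c in actual) / len(actual)


def same_ranking(predicted, actual):
    """True if candidates are ordered identically by predicted and actual share."""
    by_pred = sorted(predicted, key=predicted.get, reverse=True)
    by_actual = sorted(actual, key=actual.get, reverse=True)
    return by_pred == by_actual


# Hypothetical counts and official results (percent); not data from the study.
counts = {"A": 50, "B": 30, "C": 20}
official = {"A": 45.0, "B": 35.0, "C": 20.0}

shares = estimated_shares(counts)
print(mean_absolute_error(shares, official))
print(same_ranking(shares, official))
```

A low mean absolute error and a correct ranking can both hold for one election round without the estimator being reliable in general, which is why consistency across methods and rounds is assessed separately.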