Credibility ranking of tweets during high impact events


Twitter has evolved from being a conversation or opinion sharing medium among friends into a platform to share and disseminate information about current events. Events in the real world create a corresponding spurt of posts (tweets) on Twitter. Not all content posted on Twitter is trustworthy or useful in providing information about the event. In this paper, we analyzed the credibility of information in tweets corresponding to fourteen high impact news events of 2011 around the globe. From the data we analyzed, on average 30% of total tweets posted about an event contained situational information about the event while 14% was spam. Only 17% of the total tweets posted about the event contained situational awareness information that was credible. Using regression analysis, we identified the important content based and source based features that can predict the credibility of information in a tweet. Prominent content based features were the number of unique characters, swear words, pronouns, and emoticons in a tweet; prominent user based features were the number of followers and the length of the username. We adopted a supervised machine learning and relevance feedback approach using the above features to rank tweets according to their credibility score. The performance of our ranking algorithm improved significantly when we applied a re-ranking strategy. Results show that extraction of credible information from Twitter can be automated with high confidence.
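To make the feature list above concrete, the following is a minimal sketch of how such content and user based features might be computed and used to order tweets. It is not the authors' implementation: the word lists, the emoticon pattern, the function names `extract_features` and `rank_by_score`, and the fixed linear weights are all assumptions made for illustration. In particular, the hand-set weights merely stand in for a model that would be learned from labeled data.

```python
import re

# Hypothetical lexicons for illustration only; the paper's actual lists are not given here.
SWEAR_WORDS = {"damn", "hell"}
PRONOUNS = {"i", "me", "my", "we", "us", "our", "you", "your",
            "he", "she", "they", "them", "it"}
# Assumed pattern covering a few simple emoticons such as :) ;-) :D =P
EMOTICON_RE = re.compile(r"[:;=][\-']?[()DPp]")


def extract_features(text, followers, username):
    """Compute the content and user based features named in the abstract."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "unique_chars": len(set(text)),
        "swear_words": sum(w in SWEAR_WORDS for w in words),
        "pronouns": sum(w in PRONOUNS for w in words),
        "emoticons": len(EMOTICON_RE.findall(text)),
        "followers": followers,
        "username_length": len(username),
    }


def rank_by_score(tweets, weights):
    """Order tweets by a linear score; the weights are placeholders for a learned model."""
    def score(tweet):
        feats = extract_features(tweet["text"], tweet["followers"], tweet["username"])
        return sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return sorted(tweets, key=score, reverse=True)


# Usage with made-up tweets and made-up weights.
tweets = [
    {"text": "damn it", "followers": 100, "username": "user_a"},
    {"text": "Road closed near bridge", "followers": 100, "username": "user_b"},
]
ranked = rank_by_score(tweets, {"swear_words": -1.0, "followers": 0.001})
```

In a supervised setting, the weights would be fit to human credibility annotations rather than chosen by hand, and a relevance feedback step could then re-rank the top results.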

DOI: 10.1145/2185354.2185356

Cite this paper

@inproceedings{Gupta2012CredibilityRO, title={Credibility ranking of tweets during high impact events}, author={Aditi Gupta and Ponnurangam Kumaraguru}, booktitle={PSOSM '12}, year={2012} }