#Bieber + #Blast = #BieberBlast: Early Prediction of Popular Hashtag Compounds

Suman Kalyan Maity, Ritvik Saraf, Animesh Mukherjee
Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing
Compounding of natural language units is a very common phenomenon. In this paper, we show, for the first time, that Twitter hashtags, which could be considered correlates of such linguistic units, undergo compounding. We identify reasons for this compounding and propose a prediction model that can identify with 77.07% accuracy whether a pair of hashtags compounding in the near future (i.e., 2 months after compounding) will become popular. At longer times T = 6, 10 months the accuracies are 77.52…

Hybrid Hashtags: #YouKnowYoureAKiwiWhen Your Tweet Contains Māori and English

Findings from the analysis of a large-scale diachronic corpus of over one million tweets, containing loanwords from te reo Māori, the indigenous language spoken in New Zealand, into (primarily, New Zealand) English are reported.

A qualitative analysis of sarcasm, irony and related #hashtags on Twitter

The prevalence of sarcastic and ironic language within social media posts is assessed, and the need for future research to rethink its approach to data preparation and to interpret sentiment analysis more carefully is highlighted.

Understanding Popularity of Social Media Entities: From hashtags to question topics

S. Maity, Computer Science, CSCW Companion, 2017
With the evolution of social media over the years, entities like hashtags, tags, topics have also evolved and adopted traits that are very similar to various natural language units. One of the

A deep-learning framework to detect sarcasm targets

A deep learning framework augmented with socio-linguistic features is presented for detecting sarcasm targets in sarcastic book snippets and tweets; it achieves a large improvement over the current state-of-the-art baseline in exact match and Dice scores.

TrollHunter2020: Real-time Detection of Trolling Narratives on Twitter During the 2020 U.S. Elections

The results suggest that TrollHunter2020 does capture emerging trolling narratives at a very early stage of an unfolding polarizing event in the 2020 U.S. elections.

RDV: An Easy to Use Data Visualisation Tool for Reddit

The design and implementation of the RDV (Reddit Data Visualisation) platform is reported, a visualisation tool aimed at facilitating the analysis of a publicly available Reddit dataset, which contains ~1.7 billion JSON objects collected from October 2007 to October 2015.

Understanding Book Popularity on Goodreads

It is successful in predicting the popularity of the books with high prediction accuracy (correlation coefficient ~0.61) and low RMSE (~1.25).

Cascades: A View from Audience

The results suggest that together these two effects enable the audience to consume a high quality stream of content in the presence of cascades, and highlight the balance between retweeting as a high-quality content selection mechanism and the role of network users in filtering irrelevant content.

Tools and frameworks for data abstraction in a performance context

This work implemented two versions of the solution to the Twitter Trend Prediction problem, aiming to map the process of taking a common machine learning engine into a streaming context, and implemented a D4M.jl interface with an emerging database technology, TileDB.



On predicting the popularity of newly emerging hashtags in Twitter

This article proposes methods to predict the popularity of new hashtags on Twitter by formulating the problem as a classification task and shows that the standard classifiers using the extracted features significantly outperform the baseline methods that do not use these features.

Analyzing the Dynamic Evolution of Hashtags on Twitter: a Language-Based Approach

A linguistically inspired study of how hashtags are created, used, and disseminated by the members of information networks is presented; understanding the formation patterns of successful hashtags on Twitter can help increase the effectiveness of real-time streaming search algorithms.

The 'hashtag': A new word or a new rule?

This paper analyzes hashtagging as a productive process of word formation in English and Italian, both online and offline, based on samples of hashtags from a corpus of tweets and samples appearing in the offline world.

Will this #hashtag be popular tomorrow?

This paper constructs a hashtag profile using the tweets containing the hashtag, and extracts both content and context features for hashtag popularity prediction, and evaluates the effectiveness of the extracted features and classification models.

What's in a hashtag?: content based prediction of the spread of ideas in microblogging communities

An efficient hybrid approach based on a linear regression for predicting the spread of an idea in a given time frame is presented and it is shown that a combination of content features with temporal and topological features minimizes prediction error.

We know what @you #tag: does the dual role affect hashtag adoption?

This work proposes comprehensive measures to quantify the major factors of how a user selects content tags as well as joins communities, and proves the effectiveness of the dual role, where both the content measures and the community measures significantly correlate to hashtag adoption on Twitter.

Predicting bursts and popularity of hashtags in real-time

This paper studies the problems of real-time prediction of bursting hashtags and proposes solutions to these challenging problems based on empirical analysis of data collected from Twitter.

Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters

This work systematically evaluates the use of large-scale unsupervised word clustering and new lexical features to improve tagging accuracy on Twitter and achieves state-of-the-art tagging results on both Twitter and IRC POS tagging tasks.

Differences in the mechanics of information diffusion across topics: idioms, political hashtags, and complex contagion on twitter

The first large-scale validation of the "complex contagion" principle from sociology, which posits that repeated exposures to an idea are particularly crucial when the idea is in some way controversial or contentious, is provided.

Dude, srsly?: The Surprisingly Formal Nature of Twitter's Language

Twitter's language is surprisingly more conservative and less informal than that of SMS and online chat; Twitter users appear to be developing linguistically unique styles; and Twitter has less variation of affect than other, more formal media.