Corpus ID: 7437957

Sense-Level Semantic Clustering of Hashtags in Social Media

Ali Javed and Byung Suk Lee
We enhance the accuracy of the currently available semantic hashtag clustering method, which leverages hashtag semantics extracted from dictionaries such as WordNet and Wikipedia. While immune to the uncontrolled and often sparse usage of hashtags, the current method distinguishes hashtag semantics only at the word level. Unfortunately, a word can have multiple senses, each of which captures one exact meaning of the word, and, therefore, word-level semantic clustering fails to disambiguate the true sense…
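The difference between word-level and sense-level comparison described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the sense inventory `SENSES`, the gloss keywords, and the use of Jaccard overlap as the similarity measure are all hypothetical choices made only for illustration, standing in for a dictionary such as WordNet.

```python
from itertools import product

# Hypothetical toy sense inventory (an assumption, not data from the paper):
# word -> {sense label: set of gloss keywords}.
SENSES = {
    "jaguar": {
        "animal": {"cat", "wild", "predator"},
        "car": {"vehicle", "brand", "engine"},
    },
    "tiger": {"animal": {"cat", "wild", "stripes"}},
    "toyota": {"car": {"vehicle", "brand", "japan"}},
}


def jaccard(a, b):
    """Jaccard overlap of two keyword sets (illustrative similarity measure)."""
    return len(a & b) / len(a | b)


def word_similarity(w1, w2):
    """Word level: collapse all senses by keeping only the best sense pair."""
    return max(
        jaccard(g1, g2)
        for g1, g2 in product(SENSES[w1].values(), SENSES[w2].values())
    )


def sense_similarities(w1, w2):
    """Sense level: keep one score per (sense, sense) pair."""
    return {
        (f"{w1}#{s1}", f"{w2}#{s2}"): jaccard(g1, g2)
        for (s1, g1), (s2, g2) in product(SENSES[w1].items(), SENSES[w2].items())
    }
```

In this toy setup, `word_similarity` rates "jaguar" as equally close to "tiger" and to "toyota", so a word-level clustering could chain all three together even though "tiger" and "toyota" share nothing. `sense_similarities` keeps `jaguar#animal` and `jaguar#car` apart, so each sense can be grouped only with the hashtags it is actually related to.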

Sense-Level Semantic Clustering of Hashtags
This paper enhances the accuracy of the currently available semantic hashtag clustering method, which leverages hashtag semantics extracted from dictionaries such as WordNet and Wikipedia, and demonstrates the impact of this enhancement on clustering behavior and accuracy.
Hybrid semantic clustering of hashtags
Real-Time Tweet Analytics Using Hybrid Hashtags on Twitter Big Data Streams
A novel semi-automated technique that derives semantically relevant hashtags using a domain-specific knowledge base of topic concepts and combines them with the existing tweet-based-hashtags to produce Hybrid Hashtags, a comprehensive framework that combines batch and online mechanisms in the most effective way.
Sentiment Analysis of Social Media Micro Blogs using Power Links and Genetic Algorithms
The approach aims to develop a non-supervised cluster agent that can correctly cluster micro blogs and define different interests of different groups of people.
SOMTimeS: Self Organizing Maps for Time Series Clustering and its Application to Serious Illness Conversations
A new DTW-based clustering method, called SOMTimeS (a Self-Organizing Map for TIME Series), that scales better and runs faster than other DTW-based clustering algorithms, and has similar performance accuracy.


Temporal Semantics: Time-Varying Hashtag Sense Clustering
This paper proposes a sense clustering algorithm for hashtags based on temporal mining, in which hashtags are clustered by string similarity and temporal co-occurrence, and performs a complexity evaluation of the algorithm.
Defining Semantic Meta-hashtags for Twitter Classification
This paper uses the user-defined hashtags as the Twitter message class labels and applies the meta-hashtag approach to boost the performance of the message classification by clustering similar messages to improve the classification.
Exploring the Meaning behind Twitter Hashtags through Clustering
This paper clusters a large set of hashtags using K-means on MapReduce in order to process data in a distributed manner, retrieve connections that might exist between different hashtags and their textual representation, and grasp their semantics through the main topics they occur with.
Unsupervised semantic clustering of Twitter hashtags
This work presents a novel methodology, based on a semantic clustering of the set of hashtags, which makes it possible to automatically obtain the topics associated with a given set of tweets.
Topical Clustering of Tweets
A study on automatically clustering and classifying Twitter messages into different categories, inspired by the approaches taken by news aggregating services like Google News, suggests that the clusters produced by traditional unsupervised methods can often be incoherent from a topical perspective.
Efficient Clustering of Short Messages into General Domains
Results show that the algorithm presented is both accurate and efficient and can be easily used for large scale clustering of sparse messages as the heavy lifting is achieved on a sublinear number of documents.
Extracting Semantic Knowledge from Twitter
This work argues that Twitter is a valuable data source for e-Participation related projects, describes other domains where Twitter has already been used, and focuses on its own semantic-analysis framework based on the previously introduced Semantic Patterns concept.
Scalable multi stage clustering of tagged micro-messages
This work proposes SMSC, a scalable, accurate and efficient multi-stage clustering algorithm that leverages users' practice of adding tags to some messages by bootstrapping over virtual non-sparse documents.
Identification of Implicit Topics in Twitter Data Not Containing Explicit Search Queries
According to experiments, each one of the suggested serializing methods achieves higher means of average precision rates than baselines such as the query matching model and the tf-idf weighting model, which indicates that considering an individual tweet within a discourse context is helpful in judging its relevance to a given topic.
Topic sentiment analysis in twitter: a graph-based hashtag sentiment classification approach
This study focuses on hashtag-level sentiment classification, which aims to automatically generate the overall sentiment polarity for a given hashtag in a certain time period, and proposes a novel graph model and three approximate collective classification algorithms for inference.