You are where you tweet: a content-based approach to geo-locating twitter users
A probabilistic framework for estimating a Twitter user's city-level location based purely on the content of the user's tweets, which can overcome the sparsity of geo-enabled features in these services and enable new location-based personalized information services, the targeting of regional advertisements, and so on.
Exploring Millions of Footprints in Location Sharing Services
It is found that LSS users follow the “Lévy flight” mobility pattern and adopt periodic behaviors; that individual social status, as well as geographic and economic constraints, affects mobility patterns; and that content- and sentiment-based analysis of posts associated with check-ins can provide a rich source of context for better understanding how users engage with these services.
Uncovering social spammers: social honeypots + machine learning
It is found that the deployed social honeypots identify social spammers with low false positive rates and that the harvested spam data contains signals that are strongly correlated with observable profile features (e.g., content, friend information, and posting patterns).
Seven Months with the Devils: A Long-Term Study of Content Polluters on Twitter
This paper presents the first long-term study of social honeypots for tempting, profiling, and filtering content polluters in social media, and evaluates a wide range of features to investigate the effectiveness of automatic content polluter identification.
A Large-Scale Study of MySpace: Observations and Implications for Online Social Networks
An extensive analysis of over 1.9 million MySpace profiles helps to show who is using these networks and how they are being used, and finds a number of surprising results.
PageRank for ranking authors in co-citation networks
It is found that in the author co-citation network, citation rank is highly correlated with PageRank under different damping factors and also with different PageRank algorithms; citation rank and PageRank are not significantly correlated with centrality measures; and h-index is not significantly correlated with centrality measures.
Tensor Completion Algorithms in Big Data Analytics
A modern overview of recent advances in tensor completion algorithms from the perspective of big data analytics characterized by diverse variety, large volume, and high velocity is provided.
Location prediction in social media based on tie strength
A novel network-based approach for location estimation in social media that integrates evidence of the social tie strength between users and significantly improves the results of location estimation relative to a state-of-the-art technique.
Crowdturfers, Campaigns, and Social Media: Tracking and Revealing Crowdsourced Manipulation of Social Media
This paper identifies three classes of crowdturfers (professional workers, casual workers, and middlemen) and develops statistical user models to automatically differentiate these workers and regular social media users.
Temporal Structure