Detecting Spammers on Twitter
TLDR
This paper uses tweets related to three famous trending topics from 2009 to construct a large labeled collection of users, manually classified into spammers and non-spammers, and identifies a number of characteristics related to tweet content and user social behavior which could potentially be used to detect spammers.
Capacity Planning for Web Services: Metrics, Models, and Methods
TLDR
Capacity Planning for Web Services: Metrics, Models, and Methods introduces quantitative performance predictive models for every major Web scenario, showing precisely how to identify and address both potential and actual performance problems.
Characterizing user behavior in online social networks
TLDR
A first-of-its-kind analysis of user workloads in online social networks, based on detailed clickstream data collected over a 12-day period, shows that browsing, which cannot be inferred from crawling publicly available data, accounts for 92% of all user activities.
Characterizing reference locality in the WWW
TLDR
The authors propose models for both temporal and spatial locality of reference in streams of requests arriving at Web servers, show that temporal locality can be characterized by the marginal distribution of the stack distance trace, and propose models for typical distributions, comparing their cache performance to that of the traces.
A methodology for workload characterization of E-commerce sites
TLDR
This paper introduces a state transition graph, called the Customer Behavior Model Graph (CBMG), that describes the behavior of groups of customers who exhibit similar navigational patterns, and proposes a clustering algorithm to characterize workloads of e-commerce sites in terms of CBMGs.
Performance by Design - Computer Capacity Planning By Example
TLDR
Practical systems modeling: learning exactly how to map real-life systems to accurate performance models, and using those models to make better decisions, both up front and throughout the entire system lifecycle.
Capacity Planning and Performance Modeling: From Mainframes to Client-Server Systems
This example-driven, easy-to-read exploration of capacity planning of computer systems is designed to be accessible and relevant both to practising professionals and those with little mathematical
Detecting Spammers and Content Promoters in Online Video Social Networks
TLDR
This paper manually builds a test collection of real YouTube users, classifying them as spammers, promoters, and legitimate users, and provides a characterization of social and content attributes that may help distinguish each user class.
On word-of-mouth based discovery of the web
TLDR
A detailed analysis of word-of-mouth exchange of URLs among Twitter users shows that Twitter yields propagation trees that are wider than they are deep, and indicates that users who are geographically close together are more likely to share the same URL.
Characterizing and Detecting Hateful Users on Twitter
TLDR
This work develops and employs a robust methodology to collect and annotate hateful users that does not depend directly on a lexicon and in which users are annotated given their entire profile, and formulates the hate speech detection problem as a task of semi-supervised learning over a graph, exploiting the network of connections on Twitter.
...