WTF: the who to follow service at Twitter

Pankaj Gupta, Ashish Goel, Jimmy J. Lin, Aneesh Sharma, Dong Wang, Reza Bosagh Zadeh
Proceedings of the 22nd International Conference on World Wide Web

WTF ("Who to Follow") is Twitter's user recommendation service, which is responsible for creating millions of connections daily between users based on shared interests, common connections, and other related factors. This paper provides an architectural overview and shares lessons we learned in building and running the service over the past few years. Particularly noteworthy was our design decision to process the entire Twitter graph in memory on a single server, which significantly reduced… 

Recommendation of microblog users based on hierarchical interest profiles

This paper proposes to accomplish the recommendation task in two steps, and casts both user classification and recommendation as itemset mining problems, where items are either users’ authoritative friends or semantic categories associated with friends, extracted from WiBi, the Wikipedia Bitaxonomy.

SimClusters: Community-Based Representations for Heterogeneous Recommendations at Twitter

This paper presents SimClusters, a general-purpose representation layer based on overlapping communities into which users as well as heterogeneous content can be captured as sparse, interpretable vectors to support a multitude of recommendation tasks.

TwHIN: Embedding the Twitter Heterogeneous Information Network for Personalized Recommendation

This work investigates knowledge-graph embeddings for entities in the Twitter HIN (TwHIN) and shows that these pretrained representations yield significant offline and online improvement for a diverse range of downstream recommendation and classification tasks: personalized ads rankings, account follow-recommendation, offensive content detection, and search ranking.

WTF, GPU! computing twitter's who-to-follow on the GPU

This paper implements Twitter's WTF ("Who to Follow") recommendation system on a single GPU, showing promising results on moderate-sized social graphs and proposing possible solutions for applying the system to follow graphs that are too large to fit into the on-board memory of a single GPU.

GraphJet: Real-Time Content Recommendations at Twitter

This paper presents GraphJet, an in-memory graph processing engine that maintains a real-time bipartite interaction graph between users and tweets and organizes the interaction graph into temporally-partitioned index segments that hold adjacency lists.
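The segment layout described above can be illustrated with a minimal sketch. It is not GraphJet's implementation: the class name, the fixed per-segment edge capacity, the rule of discarding the oldest segment, and the partitioning by arrival order are all assumptions chosen to keep the example small.

```python
from collections import defaultdict, deque


class SegmentedBipartiteGraph:
    """Toy user-tweet interaction graph split into arrival-ordered segments.

    Hypothetical illustration only; capacities and eviction are assumptions.
    """

    def __init__(self, max_edges_per_segment=4, max_segments=3):
        self.max_edges_per_segment = max_edges_per_segment
        # A bounded deque drops the oldest segment when a new one is appended.
        self.segments = deque(maxlen=max_segments)
        self._start_segment()

    def _start_segment(self):
        # Each segment holds adjacency lists in both directions.
        self.segments.append({
            "edges": 0,
            "user_to_tweets": defaultdict(list),
            "tweet_to_users": defaultdict(list),
        })

    def add_interaction(self, user, tweet):
        segment = self.segments[-1]
        if segment["edges"] >= self.max_edges_per_segment:
            # The newest segment is full, so open a fresh one.
            self._start_segment()
            segment = self.segments[-1]
        segment["user_to_tweets"][user].append(tweet)
        segment["tweet_to_users"][tweet].append(user)
        segment["edges"] += 1

    def tweets_for(self, user):
        # Read across all live segments, oldest first.
        result = []
        for segment in self.segments:
            result.extend(segment["user_to_tweets"].get(user, []))
        return result

    def users_for(self, tweet):
        result = []
        for segment in self.segments:
            result.extend(segment["tweet_to_users"].get(tweet, []))
        return result


graph = SegmentedBipartiteGraph(max_edges_per_segment=2, max_segments=2)
for user, tweet in [("u1", "t1"), ("u2", "t1"), ("u1", "t2"),
                    ("u1", "t3"), ("u2", "t4")]:
    graph.add_interaction(user, tweet)
```

In this sketch, old interactions age out a whole segment at a time, so no per-edge deletion is needed.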

Semantic Enabled Recommender System for Micro-Blog Users

This paper presents a method for analyzing Twitter users supported by a hierarchical representation of their interests, called a Twixonomy, and shows that semantic categories allow for very fine-grained population studies and make it possible to recommend not only whom to follow, but also topics of interest, users interested in the same topic, and more.

A semantic followee recommender in Twitter using Topicmodel and Kalman filter

Preliminary analysis shows that the model can effectively recommend useful followees in Twitter, and a case study is conducted to evaluate the efficacy of the model in recommending followees in six predefined classes: politics, sports, business, entertainment, science, and travel.

Twitter-user recommender system using tweets: A content-based approach

  • R. Nidhi, B. Annappa
  • Computer Science
  • 2017 International Conference on Computational Intelligence in Data Science (ICCIDS)
  • 2017
With the advent of the internet into our everyday lives, online social networks such as Facebook and Twitter have taken up a major role in networking, information deployment and entertainment. As of

Twitter User Recommendation for Gaining Followers

This paper proposes a recommendation system that leverages features ranging from basic social media attributes to specialized, domain-relevant user profile attributes predicted from data using machine learning techniques. It also gives a preliminary analysis of the system's performance in gathering new followers in a Twitter scenario where the account manager follows recommended users to trigger their follow-back.



Fast Incremental and Personalized PageRank

The overall result is that this algorithm is fast enough for real-time queries over a dynamic social network.

Supervised random walks: predicting and recommending links in social networks

An algorithm based on Supervised Random Walks is developed that naturally combines the information from the network structure with node and edge level attributes and outperforms state-of-the-art unsupervised approaches as well as approaches that are based on feature extraction.

The Unified Logging Infrastructure for Data Analytics at Twitter

This paper presents Twitter's production logging infrastructure and its evolution from application-specific logging to a unified "client events" log format, where messages are captured in common, well-formatted, flexible Thrift messages.

Large-scale machine learning at twitter

A case study of Twitter's integration of machine learning tools into its existing Hadoop-based, Pig-centric analytics platform to provide predictive analytics capabilities that incorporate machine learning, focused specifically on supervised classification.

SALSA: the stochastic approach for link-structure analysis

It is proved that SALSA is equivalent to a weighted in-degree analysis of the link structure of WWW subgraphs, making it computationally more efficient than the Mutual Reinforcement approach, and comparisons reveal a topological phenomenon called the TKC effect, which prevents the Mutual Reinforcement approach from identifying meaningful authorities.
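The in-degree equivalence stated above can be checked numerically on a toy graph. The sketch below is a simplified illustration, not code from the paper: it assumes an unweighted directed graph whose hub-authority structure forms a single connected component, and the function names and the example edges are hypothetical.

```python
from collections import defaultdict


def salsa_authority_scores(edges, iterations=100):
    """Power iteration over the SALSA authority chain.

    One step: from an authority, follow a random in-link back to a hub,
    then follow a random out-link of that hub forward to an authority.
    """
    out_links = defaultdict(list)
    in_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
        in_links[dst].append(src)

    authorities = sorted(in_links)
    score = {a: 1.0 / len(authorities) for a in authorities}
    for _ in range(iterations):
        next_score = dict.fromkeys(authorities, 0.0)
        for authority, mass in score.items():
            share = mass / len(in_links[authority])
            for hub in in_links[authority]:
                for target in out_links[hub]:
                    next_score[target] += share / len(out_links[hub])
        score = next_score
    return score


def normalized_in_degree(edges):
    """In-degree of each node divided by the total number of edges."""
    counts = defaultdict(int)
    for _, dst in edges:
        counts[dst] += 1
    return {node: count / len(edges) for node, count in counts.items()}


# Hypothetical follow edges (follower, followed) in one connected component.
edges = [("u1", "a"), ("u1", "b"), ("u2", "a"),
         ("u2", "c"), ("u3", "a"), ("u3", "b")]
salsa = salsa_authority_scores(edges)
in_degree = normalized_in_degree(edges)
```

On this example the iterated scores converge to the normalized in-degrees, which is why the ranking can be computed from in-degree counts without running the walk.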

On the precision of social and information networks

This paper proves that the Kronecker-graph based generative model of Leskovec et al. satisfies an appropriate and natural definition of user interests, and shows that this model also has high precision, high recall, and low diameter.

Design patterns for efficient graph algorithms in MapReduce

Three design patterns are presented that address issues and can be used to accelerate a large class of graph algorithms based on message passing, exemplified by PageRank, and are shown to reduce the running time of PageRank on a web graph with 1.4 billion edges by 69%.

A Survey of Link Prediction in Social Networks

This article surveys some representative link prediction methods by categorizing them by the type of models, largely considering three types of models: first, the traditional (non-Bayesian) models which extract a set of features to train a binary classification model, and second, the probabilistic approaches which model the joint-probability among the entities in a network by Bayesian graphical models.

Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters

This paper employs approximation algorithms for the graph-partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities, and defines the network community profile plot, which characterizes the "best" possible community—according to the conductance measure—over a wide range of size scales.

On compressing social networks

This work proposes simple combinatorial formulations that encapsulate efficient compressibility of graphs and shows that some of the problems are NP-hard yet admit effective heuristics, some of which can exploit properties of social networks such as link reciprocity.