The Anatomy of the Facebook Social Graph
TLDR
A strong effect of age on friendship preferences as well as a globally modular community structure driven by nationality are observed, but it is shown that while the Facebook graph as a whole is clearly sparse, the graph neighborhoods of users contain surprisingly dense structure.
Four degrees of separation
TLDR
This work reports the first world-scale social-network graph-distance computations, using the entire Facebook network of active users. The average distance is 4.74, corresponding to 3.74 intermediaries or "degrees of separation", which prompted the title of the paper.
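The relation between average distance and intermediaries can be illustrated on a small graph: the average number of intermediaries is the average hop distance minus one. The sketch below uses exact breadth-first search on a hypothetical four-node path graph; the names `bfs_distances`, `average_distance`, and `toy` are illustrative assumptions, and exact BFS from every node is a swapped-in technique for illustration only, not necessarily how a network-scale computation would be carried out.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_distance(adj):
    """Mean hop distance over ordered pairs of distinct, connected nodes."""
    total = 0
    pairs = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    if pairs == 0:
        raise ValueError("graph has no connected pairs of distinct nodes")
    return total / pairs


# Hypothetical toy graph (an assumption for illustration): the path a - b - c - d.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

avg = average_distance(toy)
intermediaries = avg - 1  # people strictly between the two endpoints
print(f"average distance: {avg:.3f}, intermediaries: {intermediaries:.3f}")
```

All-pairs BFS costs time proportional to nodes times edges, so it is only practical for small graphs like this one.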
Structural diversity in social contagion
TLDR
This analysis of the growth of Facebook shows how data at the size and resolution of the Facebook network make possible the identification of subtle structural signals that go undetected at smaller scales yet hold pivotal predictive roles for the outcomes of social processes.
Design and Analysis of Experiments in Networks: Reducing Bias from Interference
TLDR
This work evaluates methods for designing and analyzing randomized experiments under minimal, realistic assumptions compatible with broad interference, finding substantial bias reductions and, despite a bias–variance tradeoff, error reductions.
Graph cluster randomization: network exposure to multiple universes
TLDR
It is shown that proper cluster randomization can lead to exponentially lower estimator variance when experimentally measuring average treatment effects under interference, and if a graph satisfies a restricted-growth condition on the growth rate of neighborhoods, then there exists a natural clustering algorithm, based on vertex neighborhoods, for which the variance of the estimator can be upper bounded by a linear function of the degrees.
Restreaming graph partitioning: simple versatile algorithms for advanced balancing
TLDR
This work introduces restreaming graph partitioning and develops algorithms that scale similarly to streaming partitioning algorithms yet empirically perform as well as fully offline algorithms.
Balanced label propagation for partitioning massive graphs
TLDR
This work introduces an efficient algorithm, balanced label propagation, for precisely partitioning massive graphs while greedily maximizing edge locality, the number of edges that are assigned to the same shard of a partition.
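As a rough illustration of the edge locality objective described above, the sketch below counts the edges whose two endpoints land on the same shard and also reports that count as a fraction of all edges. The function name `edge_locality`, the toy edge list, and the shard assignment are assumptions made for this example; the sketch only evaluates a given partition and does not implement balanced label propagation itself.

```python
def edge_locality(edges, shard_of):
    """Return (local_edge_count, local_edge_fraction) for a partition.

    edges    -- iterable of (u, v) pairs
    shard_of -- dict mapping each node to its shard id
    """
    edges = list(edges)
    if not edges:
        raise ValueError("edge list is empty")
    local = sum(1 for u, v in edges if shard_of[u] == shard_of[v])
    return local, local / len(edges)


# Hypothetical toy input: a 4-cycle with one chord, split into two shards.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
toy_shards = {0: 0, 1: 0, 2: 1, 3: 1}

count, fraction = edge_locality(toy_edges, toy_shards)
print(count, fraction)
```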
Configuring Random Graph Models with Fixed Degree Sequences
TLDR
This work studies the subtle but important decisions underlying the specification of a configuration model, and investigates the role these choices play in graph sampling procedures and a suite of applications, placing particular emphasis on the importance of specifying the appropriate graph labeling under which to consider a null model.
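One common way to sample from a configuration model is stub matching: give each vertex as many half-edges ("stubs") as its degree, then pair the stubs uniformly at random. The sketch below is a minimal version under the assumption that self-loops and multi-edges are allowed, which is exactly the kind of specification choice the summary above refers to. The function name `stub_matching` and the example degree sequence are hypothetical.

```python
import random
from collections import Counter


def stub_matching(degrees, seed=None):
    """Pair stubs uniformly at random; may produce self-loops and multi-edges."""
    if sum(degrees) % 2 != 0:
        raise ValueError("degree sum must be even")
    rng = random.Random(seed)
    # One stub per unit of degree, labeled by its vertex.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    # Consecutive stubs in the shuffled order form the edges.
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]


def degree_sequence(edges, n):
    """Recompute degrees from an edge list; a self-loop contributes 2."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return [counts[v] for v in range(n)]


# Hypothetical degree sequence, chosen only for illustration.
target = [3, 2, 2, 1]
sample = stub_matching(target, seed=0)
print(sample, degree_sequence(sample, len(target)))
```

Rejecting samples with loops or multi-edges, or counting graphs by vertex labels rather than stub labels, defines a different null model, so the choice should be stated explicitly.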
On the Interplay between Social and Topical Structure
TLDR
This work examines the interface of two decisive structures forming the backbone of online social media: the graph structure of social networks (who connects with whom) and the set structure of topical affiliations (who is interested in what). It also shows that computationally simple structural determinants can provide remarkable performance in both of the prediction tasks it considers.
Block models and personalized PageRank
TLDR
A principled framework for evaluating ranking methods is developed by studying seed set expansion applied to the stochastic block model, and the optimal gradient is derived. Surprisingly, under reasonable assumptions this gradient is asymptotically equivalent to personalized PageRank for a specific choice of the PageRank parameter α that depends on the block model parameters.
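For readers unfamiliar with personalized PageRank, the sketch below computes it by power iteration from a single seed node. It assumes an undirected graph given as an adjacency dict in which every node has at least one neighbor, and it treats `alpha` as the probability of teleporting back to the seed at each step; conventions for the PageRank parameter differ between sources, so this reading of α is an assumption. The function name and toy graph are hypothetical, and the sketch does not reproduce the block-model-dependent choice of α described above.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=200):
    """Power iteration for p = alpha * e_seed + (1 - alpha) * P^T p.

    adj   -- dict mapping node -> list of neighbors (no isolated nodes)
    alpha -- teleport probability back to the seed (assumed convention)
    """
    p = {u: (1.0 if u == seed else 0.0) for u in adj}
    for _ in range(iters):
        # Teleport mass goes to the seed only.
        nxt = {u: (alpha if u == seed else 0.0) for u in adj}
        # Remaining mass is spread evenly over each node's neighbors.
        for u in adj:
            share = (1.0 - alpha) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        p = nxt
    return p


# Hypothetical toy graph: the path 0 - 1 - 2 - 3, seeded at node 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
scores = personalized_pagerank(path, seed=0)
print({u: round(s, 4) for u, s in scores.items()})
```

Ranking nodes by these scores gives a simple seed set expansion: the highest-scoring nodes are the candidates for membership in the seed's community.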
...
...