Graph structure in the Web
TLDR
The study of the web as a graph yields valuable insight into web algorithms for crawling, searching and community discovery, and the sociological phenomena which characterize its evolution.
Pig latin: a not-so-foreign language for data processing
TLDR
A new language called Pig Latin is described, designed to fit in a sweet spot between the declarative style of SQL and the low-level, procedural style of map-reduce; its implementation is an open-source, Apache-incubator project available for general use.
Propagation of trust and distrust
TLDR
It is shown that a small number of expressed trust/distrust judgments per individual allows trust between any two people in the system to be predicted with high accuracy.
Trawling the Web for Emerging Cyber-Communities
TLDR
The subject of this paper is the systematic enumeration of over 100,000 emerging communities from a Web crawl, motivating a graph-theoretic approach to locating such communities, and describing the algorithms and algorithmic engineering necessary to find structures that subscribe to this notion.
Microscopic evolution of social networks
TLDR
A complete model of network evolution is presented, in which nodes arrive at a prespecified rate and select their lifetimes; the combination of the gap distribution with the node lifetime leads to a power-law out-degree distribution that accurately reflects the true network in all four cases.
Structure and evolution of online social networks
TLDR
A simple model of network growth is presented, characterizing users as passive members of the network, inviters who encourage offline friends and acquaintances to migrate online, or linkers who fully participate in the social evolution of the network.
The Web as a Graph: Measurements, Models, and Methods
TLDR
This paper describes two algorithms that operate on the Web graph, addressing problems from Web search and automatic community discovery, and proposes a new family of random graph models that point to a rich new sub-field of the study of random graphs, and raises questions about the analysis of graph algorithms on the Internet.
Evolutionary clustering
TLDR
This work presents a generic framework for clustering data over time, and discusses evolutionary versions of two widely-used clustering algorithms within this framework: k-means and agglomerative hierarchical clustering.
Stochastic models for the Web graph
TLDR
The results are twofold: it is shown that graphs generated using the proposed random graph models exhibit the statistics observed on the Web graph, and additionally, that natural graph models proposed earlier do not exhibit them.
Information diffusion through blogspace
TLDR
A macroscopic characterization of topic propagation through the authors' corpus, formalizing the notion of long-running "chatter" topics consisting recursively of "spike" topics generated by outside world events, or more rarely, by resonances within the community.