Fast distributed PageRank computation

@article{Sarma2015FastDP,
title={Fast distributed PageRank computation},
author={Atish Das Sarma and Anisur Rahaman Molla and Gopal Pandurangan and Eli Upfal},
journal={Theor. Comput. Sci.},
year={2015},
volume={561},
pages={113--121}
}
Over the last decade, PageRank has gained importance in a wide range of applications and domains, ever since it first proved to be effective in determining node importance in large graphs (and was a pioneering idea behind Google’s search engine). In distributed computing alone, the PageRank vector, or more generally random-walk-based quantities, have been used for several different applications ranging from determining important nodes, load balancing, search, and identifying connectivity structures…
52 Citations
Improved Communication Cost in Distributed PageRank Computation - A Theoretical Study
• S. Luo
• Computer Science
• ICML
• 2020
A new algorithm is provided that uses asymptotically the same communication round complexity while using only O(d log n) bits of bandwidth.
Massively Parallel Algorithms for Personalized PageRank
• Computer Science
• Proc. VLDB Endow.
• 2021
Delta-Push is an efficient framework for single-source and top-k PPR queries in distributed settings that reduces the number of rounds while guaranteeing that the load, i.e., the maximum number of messages an executor sends or receives in a round, can be bounded by the capacity of each executor.
PageRank Computation via Web Aggregation in Distributed Randomized Algorithms
• Computer Science
• 2019 IEEE 58th Conference on Decision and Control (CDC)
• 2019
This paper presents extensions of recently proposed distributed algorithms for the computation of PageRank, modified for aggregation-based computation by grouping pages in the same domain.
FAST-PPR: scaling personalized pagerank estimation for large graphs
• Computer Science, Mathematics
• KDD
• 2014
We propose a new algorithm, FAST-PPR, for computing personalized PageRank: given start node s and target node t in a directed graph, and given a threshold δ, it computes the Personalized PageRank…
Distributed Algorithms for Fully Personalized PageRank on Large Graphs
A novel study on the computation of fully edge-weighted PPR on large graphs using a distributed computing framework that employs a Monte Carlo approximation, performing a large number of random walks from each node of the graph, and exploits a parallel pipeline framework to reduce the overall running time of the fully PPR.
Efficient Algorithms for Personalized PageRank
A new bidirectional algorithm that combines linear algebra and Monte Carlo to achieve significant speed improvements is presented; it is 70x faster than past state-of-the-art algorithms.
On the Distributed Complexity of Large-Scale Graph Computations
• Computer Science
• ACM Trans. Parallel Comput.
• 2021
The General Lower Bound Theorem is established, a theorem that can be used to show non-trivial lower bounds on the round complexity of distributed large-scale data computations, and two applications show that the approach can yield lower bounds for problems where the application of communication complexity techniques seems not obvious or gives weak bounds.
Tight Bounds for Distributed Graph Computations
• Computer Science
• ArXiv
• 2016
The paper presents (almost) tight bounds for the round complexity of two fundamental graph problems, namely PageRank computation and triangle enumeration, and presents a distributed algorithm that enumerates all the triangles of a graph in $\tilde{O}(m/k^{5/3})$ rounds.
On the Embeddability of Random Walk Distances
• Computer Science
• Proc. VLDB Endow.
• 2013
This paper investigates methods to scalably and efficiently compute random-walk distances, by "embedding" graphs and distances into points and distances in geometric coordinate spaces, and proposes a new graph embedding system that explicitly accounts for per-node graph properties that affect random walk.
Distributively Computing Random Walk Betweenness Centrality in Linear Time
• Computer Science
• 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS)
• 2017
This paper proposes an O(n log n) time distributed randomized approximation algorithm for calculating each node's random walk betweenness centrality with an approximation ratio (1-∊), where n is the number of nodes and ∊ is an arbitrarily small constant between 0 and 1.

References

Fast Distributed PageRank Computation
• Computer Science
• ICDCN
• 2013
This work designs provably efficient fully-distributed algorithms for computing PageRank using traditional matrix-vector multiplication style iterative methods, which may not always adapt well to the distributed setting owing to communication bandwidth restrictions and convergence rates.
Fast Incremental and Personalized PageRank
• Computer Science
• Proc. VLDB Endow.
• 2010
The overall result is that this algorithm is fast enough for real-time queries over a dynamic social network.
Fast personalized PageRank on MapReduce
• Computer Science
• SIGMOD '11
• 2011
It is shown that the number of MapReduce iterations used by the algorithm is optimal among a broad family of algorithms for the problem, and its I/O efficiency is much better than the existing candidates.
Calculation of PageRank Over a Peer-To-Peer Network
Modern computer networks contain enormous quantities of unstructured data, and providing efficient methods of identifying interesting documents continues to be one of the largest challenges such…
Distributed Random Walks
• Computer Science, Mathematics
• JACM
• 2013
A distributed algorithm for performing random walks is presented whose time complexity is sublinear in the length of the walk; it is fully decentralized and can serve as a building block in the design of topologically-aware networks.
Local Graph Partitioning using PageRank Vectors
• Mathematics, Computer Science
• 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06)
• 2006
An improved algorithm for computing approximate PageRank vectors, which allows us to find a cut with conductance at most $\phi$ and approximately optimal balance in time $O(m \log^4 m/\phi)$ in time proportional to its size.
Distributed pagerank for P2P systems
• Computer Science
• High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on
• 2003
This paper defines and describes a fully distributed implementation of Google's highly effective pagerank algorithm, for "peer to peer" (P2P) systems, based on chaotic (asynchronous) iterative solution of linear systems, which provided approximately a ten-fold reduction in network traffic for two-word and three-word querying.
Efficient distributed random walks with applications
• Computer Science
• PODC
• 2010
A fast sublinear time distributed algorithm for performing random walks whose time complexity is sublinear in the length of the walk and which can serve as a building block in the design of topologically-aware networks is presented.
Estimating PageRank on graph streams
• Mathematics, Computer Science
• JACM
• 2011
In the streaming model, this article shows how to perform several graph computations including estimating the probability distribution after a random walk of length l, the mixing time, and other related quantities such as the conductance of the graph.
A Survey on PageRank Computing
The theoretical foundations of the PageRank formulation are examined, along with the acceleration of PageRank computing, the effects of particular aspects of web graph structure on the optimal organization of computations, and PageRank stability.