Fast distributed PageRank computation

@article{Sarma2015FastDP,
  title={Fast distributed PageRank computation},
  author={Atish Das Sarma and Anisur Rahaman Molla and Gopal Pandurangan and Eli Upfal},
  journal={Theor. Comput. Sci.},
  year={2015},
  volume={561},
  pages={113--121}
}
Over the last decade, PageRank has gained importance in a wide range of applications and domains, ever since it first proved to be effective in determining node importance in large graphs (and was a pioneering idea behind Google’s search engine). In distributed computing alone, the PageRank vector, or more generally random-walk-based quantities, have been used for several different applications ranging from determining important nodes, load balancing, search, and identifying connectivity structures…
Improved Communication Cost in Distributed PageRank Computation - A Theoretical Study
  • S. Luo
  • Computer Science
  • ICML
  • 2020
TLDR
A new algorithm is provided that uses asymptotically the same communication round complexity while using only O(d log n) bits of bandwidth.
Massively Parallel Algorithms for Personalized PageRank
TLDR
Delta-Push is an efficient framework for single-source and top-k PPR queries in distributed settings that reduces the number of rounds while guaranteeing that the load, i.e., the maximum number of messages an executor sends or receives in a round, can be bounded by the capacity of each executor.
PageRank Computation via Web Aggregation in Distributed Randomized Algorithms
  • A. Suzuki, H. Ishii
  • Computer Science
  • 2019 IEEE 58th Conference on Decision and Control (CDC)
  • 2019
TLDR
This paper presents extensions of recently proposed distributed algorithms for the computation of PageRank, modified for aggregation-based computation by grouping pages in the same domain.
FAST-PPR: scaling personalized pagerank estimation for large graphs
We propose a new algorithm, FAST-PPR, for computing personalized PageRank: given start node s and target node t in a directed graph, and given a threshold δ, it computes the Personalized PageRank…
Distributed Algorithms for Fully Personalized PageRank on Large Graphs
TLDR
A novel study on the computation of fully edge-weighted PPR on large graphs using a distributed computing framework; it employs a Monte Carlo approximation that performs a large number of random walks from each node of the graph, and exploits a parallel pipeline framework to reduce the overall running time of the fully PPR.
Efficient Algorithms for Personalized PageRank
TLDR
A new bidirectional algorithm is presented that combines linear algebra and Monte Carlo to achieve significant speed improvements; it is 70x faster than past state-of-the-art algorithms.
Agenda: Robust Personalized PageRanks in Evolving Graphs
TLDR
Experiments on up to billion-edge scale graphs show that Agenda significantly outperforms state-of-the-art methods for various query/update workloads, while maintaining better or comparable approximation accuracies.
On the Distributed Complexity of Large-Scale Graph Computations
TLDR
The General Lower Bound Theorem is established, a theorem that can be used to show non-trivial lower bounds on the round complexity of distributed large-scale data computations, and two applications show that the approach can yield lower bounds for problems where the application of communication complexity techniques is not obvious or gives weak bounds.
Tight Bounds for Distributed Graph Computations
TLDR
The paper presents (almost) tight bounds for the round complexity of two fundamental graph problems, namely PageRank computation and triangle enumeration, and presents a distributed algorithm that enumerates all the triangles of a graph in $\tilde{O}(m/k^{5/3})$ rounds.
On the Embeddability of Random Walk Distances
TLDR
This paper investigates methods to scalably and efficiently compute random-walk distances, by "embedding" graphs and distances into points and distances in geometric coordinate spaces, and proposes a new graph embedding system that explicitly accounts for per-node graph properties that affect random walks.

References

SHOWING 1-10 OF 26 REFERENCES
Fast Distributed PageRank Computation
TLDR
This work designs provably efficient fully distributed algorithms for computing PageRank, in contrast to traditional matrix-vector-multiplication-style iterative methods, which may not always adapt well to the distributed setting owing to communication bandwidth restrictions and convergence rates.
Fast Incremental and Personalized PageRank
TLDR
The overall result is that this algorithm is fast enough for real-time queries over a dynamic social network.
Fast personalized PageRank on MapReduce
TLDR
It is shown that the number of MapReduce iterations used by the algorithm is optimal among a broad family of algorithms for the problem, and its I/O efficiency is much better than the existing candidates.
Calculation of PageRank Over a Peer-To-Peer Network
Modern computer networks contain enormous quantities of unstructured data, and providing efficient methods of identifying interesting documents continues to be one of the largest challenges such…
Distributed Random Walks
TLDR
A distributed algorithm for performing random walks whose time complexity is sublinear in the length of the walk; it is fully decentralized and can serve as a building block in the design of topologically-aware networks.
Local Graph Partitioning using PageRank Vectors
TLDR
An improved algorithm for computing approximate PageRank vectors, which allows us to find a cut with conductance at most $\phi$ and approximately optimal balance in time $O(m \log^4 m/\phi)$.
Distributed pagerank for P2P systems
TLDR
This paper defines and describes a fully distributed implementation of Google's highly effective PageRank algorithm for "peer to peer" (P2P) systems, based on chaotic (asynchronous) iterative solution of linear systems, which provided approximately a ten-fold reduction in network traffic for two-word and three-word querying.
Efficient distributed random walks with applications
TLDR
A fast distributed algorithm for performing random walks is presented; its time complexity is sublinear in the length of the walk, and it can serve as a building block in the design of topologically-aware networks.
Estimating PageRank on graph streams
TLDR
In the streaming model, this article shows how to perform several graph computations including estimating the probability distribution after a random walk of length l, the mixing time, and other related quantities such as the conductance of the graph.
A Survey on PageRank Computing
TLDR
The survey examines the theoretical foundations of the PageRank formulation, the acceleration of PageRank computing, the effects of particular aspects of web graph structure on the optimal organization of computations, and PageRank stability.