# Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

@article{Dhulipala2018TheoreticallyEP, title={Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable}, author={Laxman Dhulipala and G. Blelloch and Julian Shun}, journal={Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures}, year={2018} }

There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly-available real-world graph (the Hyperlink Web graph with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature reports results…

#### 69 Citations

Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

- Computer Science
- ACM Trans. Parallel Comput.
- 2021

It is shown that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes.

Provably Efficient and Scalable Shared-Memory Graph Algorithms

- 2019

Parallel graph algorithms are important to a variety of computational disciplines today due to the widespread availability of large-scale graph-based data. Existing work that processes very large…

Practical parallel hypergraph algorithms

- Computer Science
- PPoPP
- 2020

A collection of efficient parallel algorithms for hypergraph processing is presented, including algorithms for betweenness centrality, maximal independent set, k-core decomposition, hypertrees, hyperpaths, connected components, PageRank, and single-source shortest paths.

Parallel graph algorithms in constant adaptive rounds

- Computer Science
- Proc. VLDB Endow.
- 2020

This paper focuses on the Adaptive Massively Parallel Computation (AMPC) model, a theoretical model that captures MapReduce-like computation augmented with a distributed hash table and that can achieve improvements in both running time and round complexity over optimized MPC baselines.

Parallel algorithms for finding connected components using linear algebra

- Computer Science
- J. Parallel Distributed Comput.
- 2020

A class of parallel connected-component algorithms is presented, designed using linear-algebraic primitives and based on a PRAM algorithm by Shiloach and Vishkin. The algorithms use standard GraphBLAS operations and outperform previous algorithms by a significant margin.

Fast Spectral Graph Layout on Multicore Platforms

- Computer Science
- ICPP
- 2020

ParHDE is presented, a shared-memory parallelization of the High-Dimensional Embedding graph algorithm that can process graphs with billions of edges in minutes, is up to 18× faster than a prior parallel implementation of HDE, and achieves up to a 24× relative speedup on a 28-core system.

Traversing large graphs on GPUs with unified memory

- Computer Science
- Proc. VLDB Endow.
- 2020

A lightweight offline graph reordering algorithm, HALO (Harmonic Locality Ordering), is proposed that can be used as a pre-processing step for static graphs and specifically aims to cover large directed real-world graphs in addition to undirected graphs, whereas prior methods account only for the latter.

Low-latency graph streaming using compressed purely-functional trees

- Computer Science
- PLDI
- 2019

This paper designs theoretically-efficient and practical algorithms for performing batch updates to C-trees, and shows that it can store massive dynamic real-world graphs using only a few bytes per edge, thereby achieving space usage close to that of the best static graph processing frameworks.

Parallel Graph Algorithms in Constant Adaptive Rounds: Theory meets Practice

- Computer Science
- 2020

We study fundamental graph problems such as graph connectivity, minimum spanning forest (MSF), and approximate maximum (weight) matching in a distributed setting. In particular, we focus on the…

Terrace: A Hierarchical Graph Container for Skewed Dynamic Graphs

- Computer Science
- SIGMOD Conference
- 2021

Terrace is presented, a system for streaming graphs that uses a hierarchical data structure design to store a vertex's neighbors in different data structures depending on the degree of the vertex, enabling Terrace to dynamically partition vertices based on their degrees and adapt to skewness in the underlying graph.

#### References

Scalable parallel minimum spanning forest computation

- Computer Science
- PPoPP '12
- 2012

This paper proposes a novel, scalable, parallel MSF algorithm for undirected weighted graphs that leverages Prim's algorithm in a parallel fashion, concurrently expanding several subsets of the computed MSF.

A Case Study of Complex Graph Analysis in Distributed Memory: Implementation and Optimization

- Computer Science
- 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
- 2016

A compact and efficient graph representation is developed, several graph analytics are implemented, and a number of optimizations that can be applied to these analytics are described.

Fast In-Memory Triangle Listing for Large Real-World Graphs

- Mathematics, Computer Science
- SNAKDD'14
- 2014

This paper proposes a fast and precise in-memory solution for the triangle listing problem, and shows how the theoretical lower bound can be achieved by sorting the nodes in the graph by their degree and applying pruning.

Ligra: a lightweight graph processing framework for shared memory

- Computer Science
- PPoPP '13
- 2013

This paper presents a lightweight graph processing framework that is specific for shared-memory parallel/multicore machines, which makes graph traversal algorithms easy to write and significantly more efficient than previously reported results using graph frameworks on machines with many more cores.

Designing Multithreaded Algorithms for Breadth-First Search and st-connectivity on the Cray MTA-2

- Computer Science
- 2006 International Conference on Parallel Processing (ICPP'06)
- 2006

This paper presents fast parallel implementations of three fundamental graph theory problems, breadth-first search, st-connectivity and shortest paths for unweighted graphs, on multithreaded architectures such as the Cray MTA-2, and reports impressive results, both for algorithm execution time and parallel performance.

BFS and Coloring-Based Parallel Algorithms for Strongly Connected Components and Related Problems

- Computer Science
- 2014 IEEE 28th International Parallel and Distributed Processing Symposium
- 2014

The Multistep method is introduced, a new approach that avoids work inefficiencies seen in prior SCC approaches and scales well on several real-world graphs, with performance fairly independent of topological properties such as the size of the largest SCC and the total number of SCCs.

Multicore triangle computations without tuning

- Computer Science
- 2015 IEEE 31st International Conference on Data Engineering
- 2015

This paper describes the design and implementation of simple and fast multicore parallel algorithms for exact, as well as approximate, triangle counting and other triangle computations that scale to billions of nodes and edges and are much faster than existing parallel approximate triangle counting implementations.

Pregel: a system for large-scale graph processing

- Computer Science
- SIGMOD Conference
- 2010

A model for processing large graphs is presented that has been designed for efficient, scalable, and fault-tolerant implementation on clusters of thousands of commodity computers; its implied synchronicity makes reasoning about programs easier.

Fast shared-memory algorithms for computing the minimum spanning forest of sparse graphs

- Computer Science
- 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
- 2004

Four parallel MST algorithms are designed and implemented for arbitrary sparse graphs that for the first time give speedup when compared with the best sequential algorithm, and also solve the minimum spanning forest problem.

Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds

- Computer Science
- SODA
- 2020

This paper gives an algorithm for dynamic graph connectivity in this setting with constant communication rounds and communication cost almost linear in terms of the batch size, and illustrates the power of dynamic algorithms in the MPC model by showing that the batched version of the adaptive connectivity problem is $\mathsf{P}$-complete in the centralized setting, but sub-linear sized batches can be handled in a constant number of rounds.