Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

@article{Dhulipala2018TheoreticallyEP,
  title={Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable},
  author={Laxman Dhulipala and G. Blelloch and Julian Shun},
  journal={Proceedings of the 30th Symposium on Parallelism in Algorithms and Architectures},
  year={2018}
}
There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly-available real-world graph (the Hyperlink Web graph with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature reports results…
Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable
TLDR
It is shown that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes.
Provably Efficient and Scalable Shared-Memory Graph Algorithms
Parallel graph algorithms are important to a variety of computational disciplines today due to the widespread availability of large-scale graph-based data. Existing work that processes very large…
Practical parallel hypergraph algorithms
TLDR
A collection of efficient parallel algorithms for hypergraph processing is presented, including algorithms for betweenness centrality, maximal independent set, k-core decomposition, hypertrees, hyperpaths, connected components, PageRank, and single-source shortest paths.
Parallel graph algorithms in constant adaptive rounds
TLDR
The focus is on the Adaptive Massively Parallel Computation (AMPC) model, a theoretical model that captures MapReduce-like computation augmented with a distributed hash table and can achieve improvements in both running time and round complexity over optimized MPC baselines.
Parallel algorithms for finding connected components using linear algebra
TLDR
A class of parallel connected-component algorithms is presented, based on a PRAM algorithm by Shiloach and Vishkin and designed using linear-algebraic primitives; the algorithms use standard GraphBLAS operations and outperform previous algorithms by a significant margin.
Fast Spectral Graph Layout on Multicore Platforms
TLDR
ParHDE is presented, a shared-memory parallelization of the High-Dimensional Embedding graph algorithm that can process graphs with billions of edges in minutes, is up to 18× faster than a prior parallel implementation of HDE, and achieves up to a 24× relative speedup on a 28-core system.
Traversing large graphs on GPUs with unified memory
TLDR
A lightweight offline graph reordering algorithm, HALO (Harmonic Locality Ordering), is proposed that can be used as a pre-processing step for static graphs and specifically aims to cover large directed real-world graphs in addition to undirected graphs, whereas prior methods only account for the latter.
Low-latency graph streaming using compressed purely-functional trees
TLDR
This paper designs theoretically-efficient and practical algorithms for performing batch updates to C-trees, and shows that it can store massive dynamic real-world graphs using only a few bytes per edge, thereby achieving space usage close to that of the best static graph processing frameworks.
Parallel Graph Algorithms in Constant Adaptive Rounds: Theory meets Practice
We study fundamental graph problems such as graph connectivity, minimum spanning forest (MSF), and approximate maximum (weight) matching in a distributed setting. In particular, we focus on the…
Terrace: A Hierarchical Graph Container for Skewed Dynamic Graphs
TLDR
Terrace is presented, a system for streaming graphs that uses a hierarchical data structure design to store a vertex's neighbors in different data structures depending on the degree of the vertex, enabling Terrace to dynamically partition vertices based on their degrees and adapt to skewness in the underlying graph.

References

Scalable parallel minimum spanning forest computation
TLDR
This paper proposes a novel, scalable, parallel MSF algorithm for undirected weighted graphs that leverages Prim's algorithm in a parallel fashion, concurrently expanding several subsets of the computed MSF.
A Case Study of Complex Graph Analysis in Distributed Memory: Implementation and Optimization
TLDR
A compact and efficient graph representation is developed, several graph analytics are implemented, and a number of optimizations that can be applied to these analytics are described.
Fast In-Memory Triangle Listing for Large Real-World Graphs
TLDR
This paper proposes a fast and precise in-memory solution for the triangle listing problem, and proves how the theoretical lower bound can be achieved by sorting the nodes in the graph by their degree and applying pruning.
Ligra: a lightweight graph processing framework for shared memory
TLDR
This paper presents a lightweight graph processing framework specific to shared-memory parallel/multicore machines, which makes graph traversal algorithms easy to write and significantly more efficient than previously reported results using graph frameworks on machines with many more cores.
Designing Multithreaded Algorithms for Breadth-First Search and st-connectivity on the Cray MTA-2
TLDR
This paper presents fast parallel implementations of three fundamental graph theory problems (breadth-first search, st-connectivity, and shortest paths for unweighted graphs) on multithreaded architectures such as the Cray MTA-2, and reports impressive results, both for algorithm execution time and parallel performance.
BFS and Coloring-Based Parallel Algorithms for Strongly Connected Components and Related Problems
TLDR
The Multistep method is introduced, a new approach that avoids work inefficiencies seen in prior SCC approaches and scales well on several real-world graphs, with performance fairly independent of topological properties such as the size of the largest SCC and the total number of SCCs.
Multicore triangle computations without tuning
TLDR
This paper describes the design and implementation of simple and fast multicore parallel algorithms for exact, as well as approximate, triangle counting and other triangle computations that scale to billions of nodes and edges, and are much faster than existing parallel approximate triangle counting implementations.
Pregel: a system for large-scale graph processing
TLDR
A model for processing large graphs is presented that has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier.
Fast shared-memory algorithms for computing the minimum spanning forest of sparse graphs
  • David A. Bader, Guojing Cong
  • Computer Science
  • 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
  • 2004
TLDR
Four parallel MST algorithms are designed and implemented for arbitrary sparse graphs that for the first time give speedup when compared with the best sequential algorithm, and also solve the minimum spanning forest problem.
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
TLDR
This paper gives an algorithm for dynamic graph connectivity in this setting with constant communication rounds and communication cost almost linear in terms of the batch size, and illustrates the power of dynamic algorithms in the MPC model by showing that the batched version of the adaptive connectivity problem is $\mathsf{P}$-complete in the centralized setting, but sub-linear sized batches can be handled in a constant number of rounds.