An Adaptive Parallel Algorithm for Computing Connected Components

@article{Jain2017AnAP,
  title={An Adaptive Parallel Algorithm for Computing Connected Components},
  author={Chirag Jain and Patrick Flick and Tony Pan and Oded Green and Srinivas Aluru},
  journal={IEEE Transactions on Parallel and Distributed Systems},
  year={2017},
  volume={28},
  pages={2428--2439}
}
We present an efficient distributed memory parallel algorithm for computing connected components in undirected graphs based on Shiloach-Vishkin’s PRAM approach. We discuss multiple optimization techniques that reduce communication volume as well as load-balance the algorithm. We also note that the efficiency of the parallel graph connectivity algorithm depends on the underlying graph topology. Particularly for short diameter graph components, we observe that parallel Breadth First Search (BFS… 
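As a rough illustration of the hook-and-shortcut idea behind Shiloach-Vishkin-style connectivity algorithms, the following is a minimal sequential Python sketch. It is not the distributed-memory algorithm from the paper and includes none of its communication or load-balancing optimizations. The function name `connected_components` and the edge-list input format are assumptions made for this example.

```python
def connected_components(num_vertices, edges):
    """Label each vertex with the smallest vertex id in its component.

    Simplified, sequential hook-and-shortcut sketch (illustrative only).
    `edges` is a list of (u, v) pairs over vertices 0..num_vertices-1.
    """
    parent = list(range(num_vertices))  # every vertex starts as its own root
    changed = True
    while changed:
        changed = False
        # Hooking: for each edge joining two different trees, attach the
        # root with the larger id to the smaller label.
        for u, v in edges:
            pu, pv = parent[u], parent[v]
            if pu == pv:
                continue
            hi, lo = (pu, pv) if pu > pv else (pv, pu)
            if parent[hi] == hi:  # only hook vertices that are still roots
                parent[hi] = lo
                changed = True
        # Shortcutting (pointer jumping): flatten every tree into a star.
        for v in range(num_vertices):
            while parent[v] != parent[parent[v]]:
                parent[v] = parent[parent[v]]
    return parent


if __name__ == "__main__":
    labels = connected_components(6, [(0, 1), (1, 2), (3, 4)])
    print(labels)
```

Because a root is only ever hooked to a smaller label, parent pointers strictly decrease along any path, so no cycles can form and the loop stops once every edge joins two vertices with the same label.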

FastSV: A Distributed-Memory Connected Component Algorithm with Fast Convergence

TLDR
The algorithm simplifies the classic Shiloach-Vishkin algorithm and employs several novel and efficient hooking strategies for faster convergence. The different steps of FastSV are mapped to linear algebraic operations and implemented with the help of scalable graph libraries.

ConnectIt: A Framework for Static and Incremental Parallel Graph Connectivity Algorithms

TLDR
The ConnectIt framework provides different sampling strategies as well as various tree linking and compression schemes, and is able to compute connectivity on the largest publicly available graph in under 10 seconds using a 72-core machine.

LACC: A Linear-Algebraic Algorithm for Finding Connected Components in Distributed Memory

  • A. Azad, A. Buluç
  • Computer Science
    2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
  • 2019
TLDR
This paper presents a parallel connected-components algorithm for distributed-memory computers that uses linear algebraic primitives and is based on a PRAM algorithm by Awerbuch and Shiloach; it outperforms previous algorithms by a significant margin.

Distributed Algorithms for Connectivity and MST in Large Graphs with Efficient Local Computation

TLDR
A well-studied flooding algorithm for connectivity and connected components that takes rounds and local computation time and presents two deterministic algorithms which are increasingly sophisticated implementations of the classical Borůvka’s algorithm, the last of which has round complexity and local computation complexity.

Parallel algorithms for finding connected components using linear algebra

VPC: Pruning connected components using vector-based path compression for Graph500

TLDR
Experimental results validate that the two-dimensional adjacency vector has better performance than other data structures and the vector-based path compression is used in the realization of the union-find algorithm.

Parallel and Scalable Combinatorial String and Graph Algorithms on Distributed Memory Systems

TLDR
This work presents distributed-memory parallel algorithms for indexing large genomic datasets, including algorithms for construction of suffix arrays and LCP arrays, solving the All-Nearest-Smaller-Values problem, and its application to the construction of suffix trees.

Exploring the Design Space of Static and Incremental Graph Connectivity Algorithms on GPUs

TLDR
This paper explores various design choices in GPU connectivity algorithms, including sampling, linking, and tree compression, for both the static and the incremental setting, leading to over 300 new GPU implementations of connectivity, many of which outperform the state of the art.

Synchronization-Avoiding Graph Algorithms

TLDR
It is demonstrated that eliminating synchronization in conjunction with effective scheduling policies and optimizations in the runtime results in improved scalability for both classes of algorithms, and novel, synchronization-avoiding algorithms are developed.

Thrifty Label Propagation: Fast Connected Components for Skewed-Degree Graphs

TLDR
The implications of the skewed degree distribution of real-world graphs on their connectivity are investigated, and these features are used to introduce Thrifty Label Propagation as a structure-aware CC algorithm obtained by incorporating 4 fundamental optimization techniques in the Label Propagation CC algorithm.

References

BFS and Coloring-Based Parallel Algorithms for Strongly Connected Components and Related Problems

TLDR
The Multistep method is introduced, a new approach that avoids work inefficiencies seen in prior SCC approaches and scales well on several real-world graphs, with performance fairly independent of topological properties such as the size of the largest SCC and the total number of SCCs.

A fast, parallel spanning tree algorithm for symmetric multiprocessors

  • David A. Bader, Guojing Cong
  • Computer Science
    18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
  • 2004
TLDR
A new randomized algorithm and implementation with superior performance that for the first time achieves parallel speedup on arbitrary graphs (both regular and irregular topologies) when compared with the best sequential implementation for finding a spanning tree.

A Parallel Algorithm for Connected Components on Distributed Memory Machines

TLDR
This work has designed and implemented a CC algorithm in C++ and MPI, by combining the ideas of the previous PRAM and distributed memory algorithms, and implementing a method for reducing the number of exchanged messages which is based on buffering messages and on deferred processing of answers.

A simple and practical linear-work parallel algorithm for connectivity

TLDR
This work describes a simple and practical expected linear-work, polylogarithmic-depth parallel algorithm for graph connectivity and is the first parallel connectivity algorithm that is both theoretically and practically efficient.

Highly scalable graph search for the Graph500 benchmark

TLDR
An optimized method based on 2D partitioning and other methods such as communication compression and vertex sorting is devised to handle BFS (Breadth First Search) of a large graph with 2^36 vertices and 2^40 edges in 10.58 seconds, which corresponds to 103.9 GE/s.

A Case Study of Complex Graph Analysis in Distributed Memory: Implementation and Optimization

TLDR
A compact and efficient graph representation is developed, several graph analytics are implemented, and a number of optimizations that can be applied to these analytics are described.

Parallel breadth-first search on distributed memory systems

  • A. Buluç, Kamesh Madduri
  • Computer Science
    2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC)
  • 2011
TLDR
This work presents two highly-tuned parallel approaches for Breadth-First Search on large parallel systems: a level-synchronous strategy that relies on a simple vertex-based partitioning of the graph, and a two-dimensional sparse matrix partitioning-based approach that mitigates parallel communication overhead.

An Evaluation of Parallel Eccentricity Estimation Algorithms on Undirected Real-World Graphs

TLDR
The high accuracy, efficiency, and parallelism of the best implementation of the graph eccentricity estimation algorithms allow the fast generation of eccentricity estimates for large graphs, which are useful in many applications arising in large-scale network analysis.

PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs

TLDR
This paper describes the challenges of computation on natural graphs in the context of existing graph-parallel abstractions and introduces the PowerGraph abstraction which exploits the internal structure of graph programs to address these challenges.
...