• Corpus ID: 46408096

On-The-Fly parallel decomposition of strongly connected components

@inproceedings{Bloemen2015OnTheFlyPD,
  title={On-The-Fly parallel decomposition of strongly connected components},
  author={Vincent Bloemen},
  year={2015}
}
Several algorithms exist for decomposing a graph into its strongly connected components (SCCs). To accommodate recent non-reversible trends in hardware, we focus on utilizing multi-core architectures. Specifically, we consider parallelizing SCC algorithms in an on-the-fly setting, where SCCs are detected while the graph is being constructed, which is particularly useful for several verification techniques. We show that the current solutions are not capable of scaling efficiently and we propose…
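
To make the on-the-fly setting concrete, here is a minimal sequential sketch in Python. It is Tarjan's classic algorithm, not the parallel algorithm proposed in this work. The graph is never stored explicitly: states are discovered by calling a successor function, and each SCC is reported as soon as it is complete. The names `tarjan_on_the_fly`, `initial`, and `successors` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Set


def tarjan_on_the_fly(
    initial: Hashable,
    successors: Callable[[Hashable], Iterable[Hashable]],
) -> List[List[Hashable]]:
    """Return the SCCs reachable from `initial`, generating the graph lazily.

    Sequential Tarjan with an explicit work stack (no recursion).
    `successors` is an assumed callback that yields the neighbours of a state.
    """
    index: Dict[Hashable, int] = {}    # discovery order of each state
    lowlink: Dict[Hashable, int] = {}  # smallest index reachable via the DFS subtree
    on_stack: Set[Hashable] = set()
    stack: List[Hashable] = []         # Tarjan's stack of states not yet assigned to an SCC
    sccs: List[List[Hashable]] = []
    counter = 0

    def discover(v: Hashable) -> None:
        nonlocal counter
        index[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)

    discover(initial)
    work = [(initial, iter(successors(initial)))]
    while work:
        v, it = work[-1]
        descended = False
        for w in it:  # resumes where the iterator left off
            if w not in index:
                # New state: it is generated only now, on the fly.
                discover(w)
                work.append((w, iter(successors(w))))
                descended = True
                break
            if w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if descended:
            continue
        # All successors of v are handled: backtrack.
        work.pop()
        if work:
            parent = work[-1][0]
            lowlink[parent] = min(lowlink[parent], lowlink[v])
        if lowlink[v] == index[v]:
            # v is the root of an SCC: pop the component off the stack.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            sccs.append(component)
    return sccs
```

Because the search only needs `successors`, the same routine works on a state space that is generated during exploration, as in model checking, and a caller could stop as soon as an SCC of interest is found.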

Multi-core on-the-fly SCC decomposition

This paper presents a novel parallel, on-the-fly SCC algorithm that preserves the linear-time property by letting workers explore the graph randomly while carefully communicating partially completed SCCs, and develops a concurrent, iterable disjoint set structure.

Efficient Parallel Graph Trimming by Arc-Consistency

This work parallelizes the AC-4-based and AC-6-based trimming algorithms for shared-memory multi-core machines, proves their correctness, and analyzes their time complexities in the work-depth model.

An Efficient Implementation of the Transitive Closure Problem on Intel KNL Architecture

An optimized implementation of an algorithm for the transitive closure problem is developed and studied using different approaches, with the aim of demonstrating the advantages and disadvantages of the Intel KNL architecture for graph-processing problems.

Randomized Concurrent Set Union and Generalized Wake-Up

This work designs a randomized algorithm that performs at most O(log n) work per operation, designs a class of "symmetric algorithms" that captures the complexities of all the known algorithms for the disjoint set union problem, and proves that the algorithm has optimal total work complexity for this class.

A Randomized Concurrent Algorithm for Disjoint Set Union

This work extends a known efficient sequential algorithm for disjoint set union to obtain a simple and efficient concurrent wait-free algorithm running on an asynchronous parallel random access machine (APRAM).

Variations on parallel explicit emptiness checks for generalized Büchi automata

These new parallel explicit emptiness checks for LTL model checking are based on a strongly connected component (SCC) enumeration and support generalized Büchi acceptance, and require no synchronization points or recomputing procedures.

Parallel model checking of ω-automata

This research focuses on designing and improving parallel graph search algorithms for emptiness checking on various types of ω-automata, and develops a scalable multi-core on-the-fly algorithm for the detection of strongly connected components (SCCs).

Monte Carlo Tree Search With Reversibility Compression

  • Michael Cook
  • Computer Science
    2021 IEEE Conference on Games (CoG)
  • 2021
MCTS with Reversibility Compression is introduced. It uses the notion of action reversibility to compress MCTS trees as they are constructed, without loss of information; this improves search by preventing the duplication of already-explored states and by increasing the attention paid to significant actions.

Concurrent disjoint set union

It is proved that for a class of symmetric algorithms that includes the authors' DCAS and randomized algorithms, no better step or work bound is possible, making their algorithms truly scalable.

References


BFS and Coloring-Based Parallel Algorithms for Strongly Connected Components and Related Problems

The Multistep method is introduced, a new approach that avoids work inefficiencies seen in prior SCC approaches and scales well on several real-world graphs, with performance fairly independent of topological properties such as the size of the largest SCC and the total number of SCCs.

Improved Distributed Algorithms for SCC Decomposition

Computing Strongly Connected Components in Parallel on CUDA

This paper designs a new CUDA-aware procedure for pivot selection, adapts selected parallel algorithms for CUDA-accelerated computation, and experimentally demonstrates that a single GTX 480 GPU card can easily outperform the optimal serial CPU implementation by an order of magnitude.

On fast parallel detection of strongly connected components (SCC) in small-world graphs

This paper investigates the shortcomings of the conventional approach in parallel SCC detection and proposes a series of extensions that consider the fundamental properties of real-world graphs, e.g. the small-world property.

Efficient decomposition of strongly connected components on GPUs

Distributed Algorithms for SCC Decomposition

It is shown that it is possible to perform SCC decomposition in parallel efficiently and that OBFR, if properly implemented, is the best choice in most cases.

Multi-core Model Checking Algorithms for LTL Verification with Fairness Assumptions

This work proposes two new parallel algorithms based on a strongly connected component (SCC) search algorithm (i.e., Tarjan's algorithm) that can not only check LTL properties but also handle fairness assumptions.

GPU-Based Graph Decomposition into Strongly Connected and Maximal End Components

This paper presents parallel algorithms for component decomposition of graph structures on General Purpose Graphics Processing Units (GPUs). In particular, we consider the problem of decomposing

Wait-free parallel algorithms for the union-find problem

This paper gives a wait-free implementation of an efficient algorithm for union-find and shows that the worst-case performance of the algorithm can be improved by simulating a synchronized algorithm, or by simulating a larger machine if the data structure requests support sufficient parallelism.

Pregel: a system for large-scale graph processing

Pregel is a model for processing large graphs, designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers; its implied synchronicity makes reasoning about programs easier.
...