Communication-Efficient Parallel Multiway and Approximate Minimum Cut Computation

Friedhelm Meyer auf der Heide and Gabriel Terán Martinez
We examine different variants of minimum cut problems on undirected weighted graphs in the p-processor bulk synchronous parallel (BSP) model of Valiant. This model and its cost measure guide algorithm designers towards work-efficient algorithms that need very little communication. Karger and Stein have presented a recursive contraction algorithm to solve minimum cut problems. They suggest a PRAM implementation of their algorithm working in polynomial polylogarithmic time… 
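
The recursive contraction idea mentioned in the abstract can be illustrated with a small sequential sketch. This is not the BSP or PRAM implementation discussed in the paper; it is a minimal, hedged illustration of the Karger-Stein scheme, assuming a connected graph stored as a dict of dicts with integer vertex labels. The function names (`contract_to`, `brute_force_min_cut`, `karger_stein`) and the choice of an exhaustive base case for at most 6 vertices are assumptions made for this example.

```python
import math
import random


def contract_to(graph, target, rng):
    """Contract random edges (picked proportionally to weight) until
    `target` super-vertices remain. Assumes the graph is connected."""
    g = {u: dict(nbrs) for u, nbrs in graph.items()}  # work on a copy
    while len(g) > target:
        edges = [(u, v, w) for u in g for v, w in g[u].items() if u < v]
        u, v, _ = rng.choices(edges, weights=[w for _, _, w in edges])[0]
        del g[u][v]                      # the contracted edge disappears
        for x, w in g.pop(v).items():    # merge v into u
            if x == u:
                continue
            del g[x][v]
            g[x][u] = g[u][x] = g[u].get(x, 0) + w  # sum parallel edges
    return g


def brute_force_min_cut(g):
    """Exact minimum cut by enumerating all bipartitions (tiny graphs only)."""
    verts = list(g)
    first, rest = verts[0], verts[1:]
    best = math.inf
    for mask in range(1 << len(rest)):
        side = {first} | {v for i, v in enumerate(rest) if mask >> i & 1}
        if len(side) == len(verts):
            continue  # not a proper cut
        weight = sum(w for u in side for v, w in g[u].items() if v not in side)
        best = min(best, weight)
    return best


def karger_stein(graph, rng):
    """One run of recursive contraction. Returns the weight of some cut,
    which equals the minimum cut only with a certain probability."""
    n = len(graph)
    if n <= 6:  # assumed base case for this sketch
        return brute_force_min_cut(graph)
    target = math.ceil(1 + n / math.sqrt(2))
    return min(karger_stein(contract_to(graph, target, rng), rng)
               for _ in range(2))
```

A single run only returns an upper bound on the minimum cut, so in practice the procedure is repeated and the smallest value is kept, for example `min(karger_stein(g, random.Random()) for _ in range(k))` for some repetition count `k`.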

Communication-avoiding parallel minimum cuts and connected components

This work provides novel scalable parallel algorithms for finding global minimum cuts and connected components, two fundamental problems in graph processing, together with an approximate variant of the minimum cut algorithm that approximates the exact solutions well while using a fraction of the cores in a fraction of the time.

Congested Clique Algorithms for the Minimum Cut Problem

An algorithm that can simultaneously solve polynomially many instances of the MST problem in O(1) rounds and is based on Karger's state-of-the-art sequential exact min-cut algorithm, which works via tree packing.



Randomized fully-scalable BSP techniques for multi-searching and convex hull construction

This work addresses two fundamental problems, multi-searching and convex hull construction, and results in algorithms that use internal time that is O((n log n)/p) and, for h = Θ(n/p), a number of communication rounds that is O(log n / log(h+1)), with high probability.

Communication-efficient parallel sorting (preliminary version)

This work provides parallel sorting methods that use internal computation time that is O((n log n)/p) and a number of communication rounds that is O(log n / log(h+1)) for h = Θ(n/p), and shows that the internal computation bound is optimal for any comparison-based sorting algorithm.

Global min-cuts in RNC, and other ramifications of a simple min-cut algorithm

This algorithm provides the first proof that the min-cut problem for weighted undirected graphs is in RNC, and does more than find a single min-cut; it finds all of them.

Parallel sorting with limited bandwidth

The lower bounds provide further convincing evidence that efficient parallel algorithms for sorting rely strongly on high communication bandwidth and can be adapted to bridging models that address the issue of limited communication bandwidth.

Efficient Parallel Graph Algorithms For Coarse Grained Multicomputers and BSP

The algorithms presented are the first practically relevant deterministic parallel algorithms for these problems that can be used on commercially available coarse-grained parallel machines, and are viewed as an important step towards the final goal of O(1) communication rounds.

Trade-offs between communication throughput and parallel time

The results suggest that new methodologies requiring a lower level of communication throughput must be invented for parallel machines that offer only a low level of it, since otherwise those machines will be severely handicapped as general-purpose parallel machines.

Direct Bulk-Synchronous Parallel Algorithms

It is shown that optimality to within a multiplicative factor close to one can be achieved for the problems of Gauss-Jordan elimination and sorting, by transportable algorithms that can be applied for a wide range of values of the parameters p, g, and L.
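
To make the roles of the parameters p, g, and L concrete, the sketch below evaluates a common formulation of the BSP cost, in which a superstep with maximum local work w and maximum per-processor communication volume h costs w + g·h + L. The `Superstep` class and `bsp_cost` function are hypothetical names introduced for this illustration, and other formulations of the superstep cost (for example, ones based on a maximum rather than a sum) exist.

```python
from dataclasses import dataclass


@dataclass
class Superstep:
    work: float  # maximum local computation over all processors
    h: int       # maximum number of words sent or received by any processor


def bsp_cost(supersteps, g, L):
    """Total cost under the assumed formulation: sum of w + g*h + L,
    where g is the per-word communication cost and L the synchronization cost."""
    return sum(s.work + g * s.h + L for s in supersteps)


# Example: two supersteps, the second one without any communication.
program = [Superstep(work=100, h=10), Superstep(work=50, h=0)]
total = bsp_cost(program, g=4, L=20)
```

The number of processors p enters implicitly through the per-processor quantities `work` and `h`, which typically shrink as p grows, while each additional superstep adds another L.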

A new approach to the minimum cut problem

A randomized, strongly polynomial algorithm that finds the minimum cut in an arbitrarily weighted undirected graph with high probability, significantly improving on the previous time bounds based on maximum flows.

Random sampling in graph optimization problems

The general technique is to generate small random representative subproblems and solve them in lieu of the original ones, producing approximately correct answers which may then be refined to correct ones at little additional cost.

Towards an architecture-independent analysis of parallel algorithms

A simple and efficient method for evaluating the performance of an algorithm, rendered as a directed acyclic graph, on any parallel computer is presented and its application to several common algorithms shows that it is surprisingly accurate.