(Semi-)External Algorithms for Graph Partitioning and Clustering

Yaroslav Akhremtsev, Peter Sanders, Christian Schulz
In this paper, we develop semi-external and external memory algorithms for graph partitioning and clustering problems. Graph partitioning and clustering are key tools for processing and analyzing large complex networks. We address both problems in the (semi-)external model by adapting the size-constrained label propagation technique. Our (semi-)external size-constrained label propagation algorithm can be used to compute graph clusterings and is a prerequisite for the (semi-)external graph… 


Parallel and External High Quality Graph Partitioning

First, this work presents an approach to shared-memory parallel multi-level graph partitioning that guarantees balanced solutions, shows high speed-ups for a variety of large graphs and yields very good quality independently of the number of cores used.

Parallel Graph Partitioning for Complex Networks

This work parallelizes and adapts the label propagation technique, originally developed for graph clustering, so that it becomes applicable to both the coarsening and the refinement phase of multilevel graph partitioning, and obtains very high quality by applying a highly parallel evolutionary algorithm to the coarsest graph.

Label Propagation for Hypergraph Partitioning

This thesis investigates the adaptation of label propagation, a graph clustering algorithm, to hypergraph partitioning. It proposes three adaptations of label propagation, motivated by graph-based hypergraph modeling, and evaluates them as coarsening strategies in a direct k-way multilevel hypergraph partitioning framework.

GraphMP: An Efficient Semi-External-Memory Big Graph Processing System on a Single Machine

This paper proposes GraphMP, which uses a vertex-centric sliding window computation model to avoid reading and writing vertices on disk, and a compressed edge cache mechanism that fully utilizes the available memory of a machine to reduce the number of disk accesses for edges.

Practical Minimum Cut Algorithms

This work introduces a linear-time algorithm to compute near-minimum cuts based on cluster contraction using label propagation and Padberg and Rinaldi’s contraction heuristics and achieves a lower running time and better parallel scalability at the expense of a higher error rate.

Scalable Graph Algorithms

This habilitation thesis summarizes a broad spectrum of scalable graph algorithms that I developed over the last six years with many coauthors, based on four highly interconnected pillars: multilevel algorithms, practical kernelization, parallelization and memetic algorithms.

I/O-Efficient Generation of Massive Graphs Following the LFR Benchmark

EM-LFR is presented, the first external memory algorithm able to generate massive complex networks following the LFR benchmark, and evidence is given, by applying clustering algorithms to generated instances, that both implementations yield graphs with matching properties.

Graph Clustering using MapReduce

Detecting community structures in networks is an important problem in graph analytics. With the recent BigData trends network sizes are growing tremendously. Oftentimes the networks are now too big

A Critical Survey of the Multilevel Method in Complex Networks

An extensive survey of the literature is presented, offering a systematic overview of the state of the art, a panorama of the historical evolution and current challenges, and a formal theoretical framework of the multilevel optimization method in complex networks.

Hardware Locality-Aware Partitioning and Dynamic Load-Balancing of Unstructured Meshes for Large-Scale Scientific Applications

An open-source topology-aware hierarchical unstructured mesh partitioning and load-balancing tool, TreePart, is presented. It was successfully integrated into the authors' in-house code, and results from a large-eddy simulation of a combustion problem are reported.



Partitioning Complex Networks via Size-Constrained Clustering

This paper describes a novel approach to partitioning graphs that is effective especially when the networks have a highly irregular structure. Graph coarsening is performed by iteratively contracting size-constrained clusterings that are computed using a label propagation algorithm.
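The size-constrained clustering step mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes unit vertex weights, a fixed visiting order, and a hypothetical upper bound `max_size` on cluster size, and the function name `size_constrained_lp` is made up for this example. A vertex moves to the neighbouring cluster with the strongest connection, but only if that cluster still has room.

```python
from collections import defaultdict


def size_constrained_lp(adj, max_size, rounds=10):
    """Cluster a graph by label propagation with a cluster size limit.

    adj: dict mapping vertex -> dict of neighbour -> edge weight (undirected).
    max_size: assumed upper bound on vertices per cluster (unit vertex weights).
    """
    cluster = {v: v for v in adj}      # every vertex starts as its own cluster
    size = defaultdict(int)
    for v in adj:
        size[v] = 1
    for _ in range(rounds):
        moved = False
        for v in sorted(adj):          # fixed order keeps the sketch deterministic
            # Total edge weight from v into each neighbouring cluster.
            conn = defaultdict(float)
            for u, w in adj[v].items():
                conn[cluster[u]] += w
            cur = cluster[v]
            best, best_w = cur, conn.get(cur, 0.0)
            for c in sorted(conn):
                # Only consider clusters that stay within the size bound.
                if c != cur and size[c] + 1 <= max_size and conn[c] > best_w:
                    best, best_w = c, conn[c]
            if best != cur:
                size[cur] -= 1
                size[best] += 1
                cluster[v] = best
                moved = True
        if not moved:                  # converged: no vertex changed cluster
            break
    return cluster
```

In a multilevel scheme, each resulting cluster would then be contracted into a single coarse vertex and the procedure repeated on the smaller graph.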

Streaming graph partitioning for large distributed graphs

This work proposes natural, simple heuristics for graph partitioning and compares their performance to hashing and METIS, a fast, offline heuristic, and shows on a large collection of graph datasets that they are a significant improvement.

A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs

This work presents a new coarsening heuristic (called heavy-edge heuristic) for which the size of the partition of the coarse graph is within a small factor of the size of the final partition obtained after multilevel refinement, and presents a much faster variation of the Kernighan-Lin (KL) algorithm for refining during uncoarsening.

A Functional Approach to External Graph Algorithms

We present a new approach for designing external graph algorithms and use it to design simple, deterministic and randomized external algorithms for computing connected components, minimum

GraphChi: Large-Scale Graph Computation on Just a PC

This work presents GraphChi, a disk-based system for computing efficiently on graphs with billions of edges, and builds on Parallel Sliding Windows to propose a new data structure, Partitioned Adjacency Lists, which is used to design an online graph database, GraphChi-DB.

Multilevel algorithms for partitioning power-law graphs

New clustering-based coarsening schemes are presented that identify and collapse together groups of highly connected vertices, and that consistently and significantly outperform existing state-of-the-art approaches for graph partitioning.

Engineering a scalable high quality graph partitioner

An approach to parallel graph partitioning that scales to hundreds of processors and produces a high solution quality is described, including a parallelization of the FM local search algorithm that works more locally than previous approaches.

Graph partitioning with the Party library: helpful-sets in practice

B. Monien, S. Schamberger. 16th Symposium on Computer Architecture and High Performance Computing, 2004.
Graph partitioning is an important subproblem in many applications. To partition a graph into more than two parts, there exist two different commonly used approaches: Either the graph is partitioned

Near linear time algorithm to detect community structures in large-scale networks.

This paper investigates a simple label propagation algorithm that uses the network structure alone as its guide and requires neither optimization of a predefined objective function nor prior information about the communities.
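A minimal Python sketch of such a label propagation scheme follows, assuming the usual formulation: vertices are visited in random order and each adopts the label most frequent among its neighbours, with ties broken at random. The function name `label_propagation`, the `seed` and `max_rounds` parameters, and the rule of keeping the current label when it is among the tied candidates are assumptions made for this illustration, not details taken from the paper.

```python
import random
from collections import Counter


def label_propagation(adj, max_rounds=20, seed=0):
    """Detect communities using only the network structure.

    adj: dict mapping vertex -> list of neighbours (undirected graph).
    Returns a dict mapping vertex -> community label.
    """
    rng = random.Random(seed)          # seeded so that runs are reproducible
    labels = {v: v for v in adj}       # each vertex starts with a unique label
    order = list(adj)
    for _ in range(max_rounds):
        rng.shuffle(order)             # random visiting order in every round
        changed = False
        for v in order:
            if not adj[v]:
                continue               # isolated vertices keep their own label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            # Assumed tie rule: keep the current label if it is among the
            # most frequent ones, otherwise pick one of them at random.
            if labels[v] in candidates:
                continue
            labels[v] = rng.choice(candidates)
            changed = True
        if not changed:                # stable: every label is a neighbour majority
            break
    return labels
```

No objective function is evaluated and no number of communities is supplied; the communities are simply the groups of vertices that end up sharing a label.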

Finding Good Approximate Vertex and Edge Partitions is NP-Hard