• Corpus ID: 211678303

Recent Advances in Scalable Network Generation

@article{Penschuck2020RecentAI,
  title={Recent Advances in Scalable Network Generation},
  author={Manuel Penschuck and Ulrik Brandes and Michael Hamann and Sebastian Lamm and Ulrich Meyer and Ilya Safro and Peter Sanders and Christian Schulz},
  journal={ArXiv},
  year={2020},
  volume={abs/2003.00736}
}
Random graph models are frequently used as a controllable and versatile data source for experimental campaigns in various research fields. Generating such data-sets at scale is a non-trivial task as it requires design decisions typically spanning multiple areas of expertise. Challenges begin with the identification of relevant domain-specific network features, continue with the question of how to compile such features into a tractable model, and culminate in algorithmic details arising while… 
Fast GPU-Based Generation of Large Graph Networks From Degree Distributions
TLDR
This work presents the design, implementation, and performance study of a novel network generator that can produce very large graph networks conforming to any desired degree distribution and provides a coarsening method that increases the GPU-based generation speed by up to a factor of 4.
Buffered Streaming Graph Partitioning
TLDR
This work develops a multilevel algorithm that optimizes an objective function that has previously been shown to be effective in the streaming setting, and removes the dependency of the running time on the number of blocks k, compared to the previous state of the art.
Artificial Benchmark for Community Detection (ABCD)—Fast random graph model with community structure
TLDR
An alternative random graph model with community structure and power law distribution for both degrees and community sizes, the Artificial Benchmark for Community Detection (ABCD graph) is provided, and its main parameter ξ can be tuned to mimic its counterpart in the LFR model, the mixing parameter μ.
Engineering Uniform Sampling of Graphs with a Prescribed Power-law Degree Sequence
TLDR
This work sketches Inc-Powerlaw, a novel and much more involved algorithm capable of generating graphs for power-law bounded degree sequences with γ ≳ 2.88 in expected linear time; it is the first practical generator with rigorous uniformity guarantees for the aforementioned degree sequences.
An Experimental Study of External Memory Algorithms for Connected Components
TLDR
This work studies whether the randomized O(Sort(E)) algorithm by Karger, Klein, and Tarjan can be implemented to compete with practically promising and simpler algorithms that have only slightly worse theoretical cost, namely Borůvka’s algorithm and the algorithm by Sibeyn and collaborators.
Linear work generation of R-MAT graphs
TLDR
This work achieves constant time per edge by precomputing pieces of node IDs of logarithmic length using an alias table data structure, which further pushes the limits of attainable graph size and makes generation overhead negligible in most situations.
fastball: A fast algorithm to sample bipartite graphs with fixed degree sequences
TLDR
It is shown that fastball randomly samples large bipartite graphs with fixed degrees more than four times faster than curveball, and the value of this faster algorithm in the context of the fixed degree sequence model for backbone extraction is illustrated.
Weighted Random Sampling – Alias Tables on the GPU
TLDR
This thesis develops a construction algorithm for GPUs that is based on the idea of PSA and achieves a speedup of 34 on a GPU in comparison to the 4-threaded PSA method on a consumer grade CPU.
Communication-Efficient Probabilistic Algorithms: Selection, Sampling, and Checking
This dissertation addresses three fundamental classes of problems in big-data systems, for which we develop communication-efficient probabilistic algorithms. In the first part, we consider…
Parallel Global Edge Switching for the Uniform Sampling of Simple Graphs with Prescribed Degrees
TLDR
This work engineers a simple sequential ES-MC implementation that represents the graph in a hash set, and proposes the Global Edge Switching Markov Chain (G-ES-MC). It provides empirical evidence that G-ES-MC requires no more switches than ES-MC (and often fewer), as well as empirical evidence of the scalability of the implementations.
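The edge-switching idea summarized in the last entry can be illustrated with a minimal sequential sketch. This is not the authors' implementation: the function name `edge_switch`, its parameters, and the uniform choice of orientation are assumptions made for illustration. It only shows the basic step of an edge switching Markov chain with the graph held in a hash set: pick two edges, swap one endpoint of each, and reject the switch if it would create a self-loop or a multi-edge.

```python
import random


def edge_switch(edges, num_switches, seed=0):
    """Randomize a simple undirected graph while preserving all degrees.

    Hypothetical helper for illustration; `edges` must describe a simple
    graph (no self-loops, no duplicate edges) with at least two edges.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)  # hash set for O(1) expected existence checks

    for _ in range(num_switches):
        # Choose two distinct edges uniformly at random.
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        # Pick one of the two possible orientations of the second edge.
        if rng.random() < 0.5:
            c, d = d, c
        # Proposed switch: {a, b}, {c, d} -> {a, d}, {c, b}.
        if a == d or c == b:
            continue  # would create a self-loop: reject
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set:
            continue  # would create a multi-edge: reject
        edge_set.discard(edge_list[i])
        edge_set.discard(edge_list[j])
        edge_set.add(new1)
        edge_set.add(new2)
        edge_list[i] = new1
        edge_list[j] = new2

    return edge_list
```

Every accepted switch removes one incident edge from each of the four endpoints and adds one back, so the degree sequence is unchanged. How many switches are needed for the result to be close to a uniform sample is exactly the kind of question the entry above examines empirically; this sketch makes no claim about it.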

References

An Efficient and Scalable Algorithmic Method for Generating Large-Scale Random Graphs
TLDR
This paper presents a novel time and space efficient algorithmic method to generate random graphs using CL, BTER, and SBM models and shows how this method leads to efficient parallel and sequential algorithms for the SBM and BTER models.
Communication-Free Massively Distributed Graph Generation
TLDR
This work presents novel generators for a variety of network models commonly found in practice by making use of pseudorandomization and divide-and-conquer schemes, which follow a communication-free paradigm and allow new graph families to be used on an unprecedented scale.
A Scalable Generative Graph Model with Community Structure
TLDR
It is proposed that the Block Two-Level Erdős-Rényi (BTER) model can be used as a graph generator for benchmarking purposes, providing idealized degree distributions and clustering coefficient profiles that can be tuned to user specifications.
Fast random graph generation
TLDR
PPreZER is proposed, an alternative, data-parallel algorithm for random graph generation under the Erdős-Rényi model, designed and implemented on a graphics processing unit (GPU). The authors arrive at this chief contribution via a succession of seven intermediary algorithms, both sequential and parallel.
Scalable and exact sampling method for probabilistic generative graph models
TLDR
This work extends the algorithm proposed in Moreno et al. (in: IEEE 14th international conference on data mining, pp 440–449, 2014) for a single model and develops a general solution for a broad class of PGGMs, and concludes by sampling a network with over a billion edges in 95 s on a single processor.
Multiscale planar graph generation
TLDR
A flexible algorithm is presented that synthesizes realistic planar networks; it preserves structural properties, including the planarity of the network, with minimal bias while introducing realistic variability at multiple scales.
Generating Synthetic Social Graphs with Darwini
TLDR
Darwini is proposed, a graph generator that captures a number of core characteristics of real graphs and can reproduce the degree distribution and, unlike existing approaches, the local clustering coefficient distribution.
Generating massive complex networks with hyperbolic geometry faster in practice
TLDR
This paper presents a fast generation algorithm for random hyperbolic graphs, and presents a dynamic extension to model gradual network change, while preserving at each step the point position probabilities.
Scalable generation of graphs for benchmarking HPC community-detection algorithms
TLDR
This work provides an alternative, based on the scalable Block Two-Level Erdős-Rényi (BTER) graph generator, that enables HPC-scale evaluation of solution quality in the style of LFR, and demonstrates the capability by showing that a label-propagation community-detection algorithm can be strong-scaled with negligible solution-quality loss.
Systematic topology analysis and generation using degree correlations
TLDR
This work presents a new, systematic approach for analyzing network topologies, introducing the dK-series of probability distributions specifying all degree correlations within d-sized subgraphs of a given graph G, and demonstrates that these graphs reproduce, with increasing accuracy, important properties of measured and modeled Internet topologies.
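Several entries above concern fast generation under the Erdős-Rényi model. As a rough illustration of why such generators can avoid testing every node pair, the sketch below uses geometric skipping in the style of Batagelj and Brandes: instead of flipping a coin per pair, it draws the distance to the next present edge. It is a sequential sketch, not the GPU algorithm from "Fast random graph generation"; the function name `gnp_skip` and its parameters are assumptions.

```python
import math
import random


def gnp_skip(n, p, seed=0):
    """Sample an undirected G(n, p) graph as a list of edges (u, v), u < v.

    Hypothetical helper for illustration. Expected running time is
    proportional to n plus the number of generated edges.
    """
    rng = random.Random(seed)
    if n < 2 or p <= 0.0:
        return []
    if p >= 1.0:
        return [(u, v) for v in range(n) for u in range(v)]

    edges = []
    log_q = math.log(1.0 - p)
    v, w = 1, -1  # current position in the row-wise enumeration of pairs
    while v < n:
        r = rng.random()  # in [0, 1), so 1 - r is in (0, 1]
        # Geometric skip: number of absent pairs before the next edge.
        w += 1 + int(math.log(1.0 - r) / log_q)
        # Carry the offset w over into subsequent rows v.
        while w >= v and v < n:
            w -= v
            v += 1
        if v < n:
            edges.append((w, v))
    return edges
```

For example, `gnp_skip(1000, 0.01)` visits only the pairs that become edges, which is the property that makes skipping attractive for sparse graphs.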