Providing Serializability for Pregel-like Graph Processing Systems

Minyang Han and Khuzaima S. Daudjee
There is considerable interest in the design and development of distributed systems that can execute algorithms to process large graphs. Serializability guarantees that parallel executions of a graph algorithm produce the same results as some serial execution of that algorithm. Serializability is required by many graph algorithms for accuracy, correctness, or termination, but existing graph processing systems either do not provide serializability or cannot provide it efficiently. To address this…
Practice of Streaming and Dynamic Graphs: Concepts, Models, Systems, and Parallelism
This work provides the first analysis and taxonomy of dynamic and streaming graph processing, focusing on identifying the fundamental system designs and on understanding their support for concurrency and parallelism, and for different graph updates as well as analytics workloads.
Practice of Streaming Processing of Dynamic Graphs: Concepts, Models, and Systems.
This work provides the first analysis and taxonomy of dynamic and streaming graph processing, focusing on identifying the fundamental system designs and on understanding their support for concurrency, and for different graph updates as well as analytics workloads.
Demystifying Graph Databases: Analysis and Taxonomy of Data Organization, System Designs, and Graph Queries
This work presents the first survey and taxonomy of graph database systems, identifying and analyzing fundamental categories of these systems, and outlines graph database queries and relationships with associated domains (NoSQL stores, graph streaming, and dynamic graph algorithms).


Giraphx: Parallel Yet Serializable Large-Scale Graph Processing
The modified framework, Giraphx, provides much better performance than implementing the application using dining-philosophers over Giraph even for embarrassingly parallel applications that do not require coordination, e.g., PageRank.
Giraph Unchained: Barrierless Asynchronous Parallel Execution in Pregel-like Graph Processing Systems
The results demonstrate that the BAP model provides efficient and transparent asynchronous execution of algorithms that are programmed synchronously, with across-the-board performance improvements of up to 5× over synchronous systems and up to an order of magnitude over asynchronous systems.
Asynchronous Large-Scale Graph Processing Made Easy
GRACE, a new graph programming platform, separates application logic from execution policies and contains a carefully designed and implemented parallel execution engine for both synchronous and user-specified built-in asynchronous execution policies.
Pregel: a system for large-scale graph processing
A model for processing large graphs that has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier.
Mizan: a system for dynamic load balancing in large-scale graph processing
This work introduces Mizan, a Pregel system that achieves efficient load balancing to better adapt to changes in computing needs and does not assume any a priori knowledge of the structure of the graph or behavior of the algorithm.
GPS: a graph processing system
This paper describes the implementation of GPS and its novel features, presents experimental results on the performance effects of both static and dynamic graph partitioning schemes, and describes the compilation of a high-level domain-specific programming language to GPS, enabling easy expression of complex algorithms.
An Experimental Comparison of Pregel-like Graph Processing Systems
A study that experimentally compares Giraph, GPS, Mizan, and GraphLab on equal ground, considering graph- and algorithm-agnostic optimizations and using several metrics, finds that the system optimizations present in Giraph and GraphLab allow them to perform well.
GraphX: a resilient distributed graph system on Spark
GraphX is introduced, which combines the advantages of both data-parallel and graph-parallel systems by efficiently expressing graph computation within the Spark data-parallel framework and provides powerful new operations to simplify graph construction and transformation.
PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs
This paper describes the challenges of computation on natural graphs in the context of existing graph-parallel abstractions and introduces the PowerGraph abstraction, which exploits the internal structure of graph programs to address these challenges.
GraphChi: Large-Scale Graph Computation on Just a PC
This work presents GraphChi, a disk-based system for computing efficiently on graphs with billions of edges, and builds on Parallel Sliding Windows to propose a new data structure, Partitioned Adjacency Lists, which is used to design an online graph database, GraphChi-DB.