Large Scale Graph Processing in a Distributed Environment

@inproceedings{Upadhyay2017LargeSG,
  title={Large Scale Graph Processing in a Distributed Environment},
  author={Nitesh Upadhyay and Parita Patel and Unnikrishnan Cheramangalath and Y. N. Srikant},
  booktitle={Euro-Par Workshops},
  year={2017}
}
Large graphs are widely used in real-world graph analytics. Memory available in a single machine is usually inadequate to process these graphs. A good solution is to use a distributed environment. Typical programming styles used in existing distributed environment frameworks are different from imperative programming and difficult for programmers to adapt to. Moreover, some graph algorithms having a high degree of parallelism ideally run on an accelerator cluster. Error prone and lower level…
Abelian: A Compiler for Graph Analytics on Distributed, Heterogeneous Platforms
TLDR
A compiler called Abelian is implemented that translates shared-memory descriptions of graph algorithms written in the Galois programming model into efficient code for distributed-memory platforms with heterogeneous processors, demonstrating that Abelian can manage heterogeneity and distributed-memory successfully while generating high-performance code.
Distributed Graph Analytics
TLDR
This work emphasizes how language abstractions and good compilation can ease programming graph analytics on distributed systems with CPU, GPU, and multi-GPU machines without sacrificing implementation efficiency.
Custom code generation for a graph DSL
TLDR
This work presents challenges faced in making a domain-specific language (DSL) for graph algorithms adapt to varying requirements of generating a spectrum of efficient parallel codes, and narrates the experiences in making an existing DSL, named Falcon, adaptive.

References

Efficient Large-Scale Graph Processing on Hybrid CPU and GPU Systems
TLDR
This work designs and develops TOTEM - a processing engine that provides a convenient environment to implement graph algorithms on hybrid platforms and shows that further performance gains can be extracted using partitioning strategies that aim to produce partitions that each matches the strengths of the processing element it is allocated to.
Falcon: A Graph Manipulation Language for Heterogeneous Systems
TLDR
A domain-specific language (DSL) is proposed, Falcon, for implementing graph algorithms that abstracts the hardware, provides constructs to write explicitly parallel programs at a higher level, and can work with general algorithms that may change the graph structure.
Simplifying Scalable Graph Processing with a Domain-Specific Language
TLDR
This paper uses Green-Marl, a Domain-Specific Language for graph analysis, to intuitively describe graph algorithms and extend its compiler to generate equivalent Pregel implementations, and shows that the Pregel programs generated by the Green-Marl compiler perform similarly to manually coded Pregel implementations of the same algorithms.
Medusa: Simplified Graph Processing on GPUs
TLDR
This work proposes a programming framework called Medusa which enables developers to leverage the capabilities of GPUs by writing sequential C/C++ code and develops a series of graph-centric optimizations based on the architecture features of GPUs for efficiency.
Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud
TLDR
This paper develops graph based extensions to pipelined locking and data versioning to reduce network congestion and mitigate the effect of network latency, and introduces fault tolerance to the GraphLab abstraction using the classic Chandy-Lamport snapshot algorithm.
The energy case for graph processing on hybrid CPU and GPU systems
TLDR
This paper investigates the power, energy, and performance characteristics of large-scale graph processing on hybrid (i.e., CPU and GPU) single-node systems and shows that a hybrid system is efficient in terms of both time-to-solution and energy.
PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs
TLDR
This paper describes the challenges of computation on natural graphs in the context of existing graph-parallel abstractions and introduces the PowerGraph abstraction which exploits the internal structure of graph programs to address these challenges.
Pregel: a system for large-scale graph processing
TLDR
A model for processing large graphs that has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier.
GoFFish: A Sub-graph Centric Framework for Large-Scale Graph Analytics
TLDR
GoFFish is introduced, a scalable sub-graph centric framework co-designed with a distributed persistent graph storage for large scale graph analytics on commodity clusters, offering the added natural flexibility of shared memory sub-graph computation.
The tao of parallelism in algorithms
TLDR
It is suggested that the operator formulation and tao-analysis of algorithms can be the foundation of a systematic approach to parallel programming.