Distributed GraphLab: A Framework for Machine Learning in the Cloud

Yucheng Low, Joseph E. Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Joseph M. Hellerstein
Proc. VLDB Endow.

While high-level data parallel frameworks, like MapReduce, simplify the design and implementation of large-scale data processing systems, they do not naturally or efficiently support many important data mining and machine learning algorithms and can lead to inefficient learning systems. Key method: We develop graph-based extensions to pipelined locking and data versioning to reduce network congestion and mitigate the effect of network latency.


GDLL: A Scalable and Share Nothing Architecture based Distributed Graph Neural Networks Framework

This work proposes GDLL, a scalable, layered, fault-tolerant, in-memory graph neural network framework based on distributed computing, which significantly outperforms the system it is compared against in efficiency while maintaining similar GNN convergence.

On Software Infrastructure for Scalable Graph Analytics

This work builds Pregelix, an open source distributed graph processing system based on an iterative dataflow design that is better tuned to handle both in-memory and out-of-core workloads and offers improved performance characteristics and scaling properties over current open source systems.

On Fault Tolerance for Distributed Iterative Dataflow Processing

This paper proposes novel fault-tolerance mechanisms for graph and machine learning analytics running on distributed dataflow systems; the mechanisms outperform blocking checkpointing and complete recovery. It also proposes replica recovery for machine learning algorithms.

PGX.D: a fast distributed graph processing engine

This paper presents PGX.D, a fast distributed graph processing system built as a low-overhead, bandwidth-efficient communication framework that supports remote data-pulling patterns. It recommends balanced "beefy" clusters in which the aggregate sustained random DRAM-access bandwidth matches the bandwidth of the underlying interconnection fabric.

GraphA: Efficient Partitioning and Storage for Distributed Graph Computation

Extensive evaluation shows that GraphA significantly outperforms state-of-the-art graph-parallel systems (GraphX and PowerLyra) in ingress time, execution time and storage cost, for both real-world and synthetic graphs.

A communication-reduced and computation-balanced framework for fast graph computation

Evaluation of LCC-Graph on a 32-node cluster, driven by real-world graph datasets, shows that it significantly outperforms existing distributed graph-processing frameworks in terms of runtime, particularly when the system is supported by a high-bandwidth network.

MOCgraph: Scalable Distributed Graph Processing Using Message Online Computing

This paper proposes MOCgraph, a scalable distributed graph processing framework based on message online computing that reduces the memory footprint and improves scalability. The authors implement it on top of Apache Giraph and test it with several representative graph algorithms.

Graph Partitioning via Parallel Submodular Approximation to Accelerate Distributed Machine Learning

This paper formulates data placement as a graph partitioning problem, gives both theoretical guarantees and a highly efficient implementation, and demonstrates promising results on both text datasets and social networks.

Scaling data mining in massively parallel dataflow systems

This paper shows how to scale the mathematical operations of two popular recommendation mining algorithms, discusses an optimistic recovery mechanism that improves the performance of distributed iterative data processing, and outlines future work on efficient sample generation for scalable meta learning.

SystemML: Declarative Machine Learning on Spark

This paper describes SystemML on Apache Spark, end to end, including insights into various optimizer and runtime techniques as well as performance characteristics.



Filtering: a method for solving graph problems in MapReduce

This paper presents new MapReduce algorithms for a variety of fundamental graph problems on sufficiently dense graphs. It also implements the maximal matching algorithm that lies at the core of the analysis and achieves a significant speedup over the sequential version.

Multithreaded Asynchronous Graph Traversal for In-Memory and Semi-External Memory

  • R. Pearce, M. Gokhale, N. Amato
  • Computer Science
    2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
  • 2010
This work presents a novel asynchronous approach to compute Breadth-First-Search (BFS), Single-Source-Shortest-Paths, and Connected Components for large graphs in shared memory to overcome data latencies and provide significant speedup over alternative approaches.

Spark: Cluster Computing with Working Sets

Spark can outperform Hadoop by 10x in iterative machine learning jobs, and can be used to interactively query a 39 GB dataset with sub-second response time.

Large graph processing in the cloud

This work demonstrates Surfer, a large graph processing engine designed to execute in the cloud, and finds that it is simple to use and highly efficient for large graph-based tasks.

PEGASUS: A Peta-Scale Graph Mining System Implementation and Observations

This paper describes PEGASUS, an open source Peta Graph Mining library that performs typical graph mining tasks such as computing the diameter of the graph, computing the radius of each node, and finding the connected components. It also describes a very important primitive for PEGASUS, called GIM-V (Generalized Iterated Matrix-Vector multiplication).

Pregel: a system for large-scale graph processing

This paper presents a model for processing large graphs, designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers; its implied synchronicity makes reasoning about programs easier.

Piccolo: Building Fast, Distributed Programs with Partitioned Tables

Experiments show Piccolo to be faster than existing data flow models for many problems, while providing similar fault-tolerance guarantees and a convenient programming interface.

PrIter: A Distributed Framework for Prioritizing Iterative Computations

This paper develops a distributed computing framework, PrIter, which supports the prioritized execution of iterative computations, and shows that PrIter achieves up to 50× speedup over Hadoop for a series of iterative algorithms.

Dryad: distributed data-parallel programs from sequential building blocks

The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: scheduling the use of computers and their CPUs, recovering from communication or computer failures, and transporting data between vertices.

A Common Substrate for Cluster Computing

Nexus fosters innovation in the cloud by letting organizations run new frameworks alongside existing ones and by letting framework developers focus on specific applications rather than building one-size-fits-all frameworks.