Distributed GraphLab: A Framework for Machine Learning in the Cloud
@article{Low2012DistributedGA,
  title   = {Distributed GraphLab: A Framework for Machine Learning in the Cloud},
  author  = {Yucheng Low and Joseph E. Gonzalez and Aapo Kyrola and Danny Bickson and Carlos Guestrin and Joseph M. Hellerstein},
  journal = {Proc. VLDB Endow.},
  year    = {2012},
  volume  = {5},
  pages   = {716-727}
}
While high-level data parallel frameworks, like MapReduce, simplify the design and implementation of large-scale data processing systems, they do not naturally or efficiently support many important data mining and machine learning algorithms and can lead to inefficient learning systems. […] Key Method: We develop graph-based extensions to pipelined locking and data versioning to reduce network congestion and mitigate the effect of network latency.
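To make the abstraction concrete, here is a minimal, self-contained sketch of a GraphLab-style vertex update function, using PageRank as the running example. All names below (Vertex, pagerank_update, run) are illustrative stand-ins rather than the actual GraphLab C++ API, and the sequential loop only mimics what the distributed engine does concurrently under pipelined locking.

```python
# Illustrative sketch only: not the GraphLab C++ API.
from collections import deque

DAMPING = 0.85

class Vertex:
    def __init__(self, vid):
        self.vid = vid
        self.rank = 1.0
        self.in_nbrs = []   # vertices with edges into this one
        self.out_nbrs = []  # vertices this one points to

def pagerank_update(v):
    """One update: recompute v.rank from the *current* ranks of its
    in-neighbors, and return the vertices to reschedule if v changed."""
    total = sum(u.rank / max(len(u.out_nbrs), 1) for u in v.in_nbrs)
    new_rank = (1.0 - DAMPING) + DAMPING * total
    converged = abs(new_rank - v.rank) < 1e-6
    v.rank = new_rank
    return [] if converged else v.out_nbrs

def run(vertices):
    """Sequential stand-in for the scheduler: the real engine runs many
    updates concurrently, with pipelined locking keeping them serializable."""
    queue = deque(vertices)
    queued = {v.vid for v in vertices}
    while queue:
        v = queue.popleft()
        queued.discard(v.vid)
        for u in pagerank_update(v):
            if u.vid not in queued:
                queue.append(u)
                queued.add(u.vid)

a, b = Vertex("a"), Vertex("b")
a.out_nbrs, a.in_nbrs = [b], [b]
b.out_nbrs, b.in_nbrs = [a], [a]
run([a, b])
```

The contrast with MapReduce is visible in the return value: an update touches only its graph neighborhood and reschedules only the neighbors it may have affected.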
746 Citations
GDLL: A Scalable and Share Nothing Architecture based Distributed Graph Neural Networks Framework
- Computer Science, IEEE Access
- 2022
This work proposes GDLL, a scalable, layered, fault-tolerant, in-memory distributed graph neural network framework, and shows that it significantly outperforms baseline solutions in efficiency while maintaining similar GNN convergence.
On Software Infrastructure for Scalable Graph Analytics
- Computer Science
- 2015
This work builds Pregelix, an open-source distributed graph processing system based on an iterative dataflow design that is better tuned to handle both in-memory and out-of-core workloads, and that offers improved performance characteristics and scaling properties over current open-source systems.
On Fault Tolerance for Distributed Iterative Dataflow Processing
- Computer Science, IEEE Transactions on Knowledge and Data Engineering
- 2017
This paper proposes novel fault-tolerance mechanisms for graph and machine learning analytics on distributed dataflow systems that outperform blocking checkpointing and complete recovery, and additionally proposes replica recovery for machine learning algorithms.
PGX.D: a fast distributed graph processing engine
- Computer Science, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis
- 2015
This paper presents PGX.D, a fast distributed graph processing system built as a low-overhead, bandwidth-efficient communication framework that supports remote data-pulling patterns, and recommends balanced, beefy clusters in which the aggregate sustained random DRAM-access bandwidth matches the bandwidth of the underlying interconnection fabric.
GraphA: Efficient Partitioning and Storage for Distributed Graph Computation
- Computer Science, IEEE Transactions on Services Computing
- 2021
Extensive evaluation shows that GraphA significantly outperforms state-of-the-art graph-parallel systems (GraphX and PowerLyra) in ingress time, execution time and storage cost, for both real-world and synthetic graphs.
A communication-reduced and computation-balanced framework for fast graph computation
- Computer Science, Frontiers of Computer Science
- 2018
Evaluation of LCC-Graph on a 32-node cluster, driven by real-world graph datasets, shows that it significantly outperforms existing distributed graph-processing frameworks in terms of runtime, particularly when the system is supported by a high-bandwidth network.
MOCgraph: Scalable Distributed Graph Processing Using Message Online Computing
- Computer Science, Proc. VLDB Endow.
- 2014
This paper proposes MOCgraph, a scalable distributed graph processing framework based on message online computing that reduces the memory footprint and improves scalability; it is implemented on top of Apache Giraph and evaluated against several representative graph algorithms.
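The core idea can be shown in a few lines: when message handling is commutative and associative (min here, as in connected-component label propagation), an incoming message can be folded into the target vertex's state the moment it arrives instead of being buffered until the next superstep. The function names are hypothetical, not MOCgraph's Giraph-based API:

```python
state = {}  # vertex id -> current label (e.g., connected-component id)

def deliver_buffered(inbox, vid, msg):
    """Conventional model: hold every message until the superstep ends,
    so memory grows with the number of in-flight messages."""
    inbox.setdefault(vid, []).append(msg)

def deliver_online(vid, msg):
    """Message online computing: fold each message into the target vertex's
    state on arrival, keeping per-vertex memory O(1). This is valid because
    min is commutative and associative."""
    if msg < state.get(vid, float("inf")):
        state[vid] = msg
```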
Graph Partitioning via Parallel Submodular Approximation to Accelerate Distributed Machine Learning
- Computer Science, ArXiv
- 2015
This paper formulates data placement as a graph partitioning problem, gives both theoretical guarantees and a highly efficient implementation, and demonstrates promising results on both text datasets and social networks.
Scaling data mining in massively parallel dataflow systems
- Computer Science, SIGMOD'14 PhD Symposium
- 2014
This paper shows how to scale the mathematical operations of two popular recommendation mining algorithms, discusses an optimistic recovery mechanism that improves the performance of distributed iterative data processing, and outlines future work on efficient sample generation for scalable meta learning.
SystemML: Declarative Machine Learning on Spark
- Computer Science, Proc. VLDB Endow.
- 2016
This paper describes SystemML on Apache Spark end-to-end, including insights into various optimizer and runtime techniques as well as performance characteristics.
References
Showing 1-10 of 39 references
Filtering: a method for solving graph problems in MapReduce
- Computer Science, SPAA '11
- 2011
This paper presents new MapReduce algorithms for a variety of fundamental graph problems on sufficiently dense graphs, and implements the maximal matching algorithm that lies at the core of the analysis, achieving a significant speedup over the sequential version.
Multithreaded Asynchronous Graph Traversal for In-Memory and Semi-External Memory
- Computer Science, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
- 2010
This work presents a novel asynchronous approach to computing Breadth-First Search (BFS), Single-Source Shortest Paths, and Connected Components for large graphs in shared memory, overcoming data latencies and providing significant speedup over alternative approaches.
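A rough single-threaded sketch of one label-correcting reading of this asynchronous approach (an illustrative assumption, not the paper's multithreaded implementation): vertices may be processed out of level order, and a distance is simply corrected downward whenever a shorter path appears, avoiding per-level barriers:

```python
from collections import deque

def async_bfs(graph, source):
    """graph: {vid: [neighbors]}. Returns hop distances from source.
    The single deque stands in for per-thread work queues; correctness
    does not depend on processing vertices in level order."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    work = deque([source])
    while work:
        v = work.popleft()
        for u in graph[v]:
            if dist[v] + 1 < dist[u]:  # correct the label; may re-enqueue u
                dist[u] = dist[v] + 1
                work.append(u)
    return dist

print(async_bfs({0: [1, 2], 1: [3], 2: [3], 3: []}, 0))  # {0: 0, 1: 1, 2: 1, 3: 2}
```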
Spark: Cluster Computing with Working Sets
- Computer Science, HotCloud
- 2010
Spark can outperform Hadoop by 10x in iterative machine learning jobs, and can be used to interactively query a 39 GB dataset with sub-second response time.
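The iterative speedup comes from keeping the working set cached in cluster memory across passes. A minimal PySpark sketch of that pattern, using the logistic regression example from the paper (the toy data, step size, and iteration count here are illustrative):

```python
import numpy as np
from pyspark import SparkContext

sc = SparkContext("local[*]", "lr-working-set-sketch")

# Each record is (feature vector, label in {-1, +1}); toy data for illustration.
points = sc.parallelize([
    (np.array([1.0, 2.0]), 1.0),
    (np.array([2.0, 1.5]), 1.0),
    (np.array([-1.0, -2.0]), -1.0),
]).cache()  # the "working set": loaded once, reused every iteration

w = np.zeros(2)
for _ in range(10):
    # Gradient of the logistic loss, summed over the cached dataset.
    grad = points.map(
        lambda p: (1.0 / (1.0 + np.exp(-p[1] * w.dot(p[0]))) - 1.0) * p[1] * p[0]
    ).reduce(lambda a, b: a + b)
    w -= 0.1 * grad

print(w)  # without .cache(), each of the 10 passes would reread the input
sc.stop()
```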
Large graph processing in the cloud
- Computer Science, SIGMOD Conference
- 2010
This paper demonstrates Surfer, a large graph processing engine designed to execute in the cloud, and finds it simple to use and highly efficient for large graph-based tasks.
PEGASUS: A Peta-Scale Graph Mining System Implementation and Observations
- Computer Science, 2009 Ninth IEEE International Conference on Data Mining
- 2009
This paper describes PEGASUS, an open-source peta-scale graph mining library that performs typical graph mining tasks such as computing the diameter of a graph, computing the radius of each node, and finding connected components, and describes GIM-V (Generalized Iterated Matrix-Vector multiplication), a very important primitive underlying PEGASUS.
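A compact in-memory sketch of the GIM-V primitive, instantiated for connected components (each vertex repeatedly takes the minimum label among itself and its neighbors). The combine2 / combineAll / assign names follow the paper's terminology, but the toy edge list and this single-process loop are stand-ins for PEGASUS's Hadoop-based, disk-resident implementation:

```python
def gimv_step(edges, v, combine2, combine_all, assign):
    """One iteration of v' = M (x)_G v over an edge list [(i, j), ...];
    m_ij is implicitly 1 when the edge (i, j) exists."""
    partial = {}
    for i, j in edges:
        partial.setdefault(i, []).append(combine2(1, v[j]))
    # combineAll over each vertex's partial results, then assign.
    return {i: assign(v[i], combine_all(partial.get(i, [v[i]]))) for i in v}

# Connected components: labels start as vertex ids and propagate by min.
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (3, 4), (4, 3)]
labels = {i: i for i in range(5)}
for _ in range(5):
    labels = gimv_step(edges, labels,
                       combine2=lambda m, vj: vj,  # paper: combine2(m_ij, v_j)
                       combine_all=min,            # paper: combineAll
                       assign=min)                 # paper: assign(v_i, new)
print(labels)  # {0: 0, 1: 0, 2: 0, 3: 3, 4: 3}
```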
Pregel: a system for large-scale graph processing
- Computer Science, SIGMOD Conference
- 2010
This paper presents a model for processing large graphs, designed for efficient, scalable, and fault-tolerant implementation on clusters of thousands of commodity computers; its implied synchronicity makes reasoning about programs easier.
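The model's superstep structure can be sketched in a few lines. This single-process loop uses max-value propagation, the example commonly used to introduce Pregel; it illustrates only the programming model (compute, message passing, vote-to-halt), not the sharded, fault-tolerant implementation:

```python
def run_max_value(graph, values):
    """graph: {vid: [out-neighbor ids]}; values: {vid: number}.
    Superstep 0: every vertex sends its value. Later supersteps: a vertex
    receiving a larger value adopts and forwards it, otherwise it votes
    to halt (here: sends nothing)."""
    inbox = {v: [] for v in graph}
    superstep = 0
    while True:
        outbox = {v: [] for v in graph}
        sent = False
        for v in graph:
            new_value = max([values[v]] + inbox[v])
            if new_value > values[v] or superstep == 0:
                values[v] = new_value
                for u in graph[v]:       # send along out-edges
                    outbox[u].append(new_value)
                sent = True
        if not sent:                     # all halted, no messages in flight
            return values
        inbox = outbox
        superstep += 1

print(run_max_value({0: [1], 1: [0, 2], 2: [1]}, {0: 3, 1: 6, 2: 2}))
# {0: 6, 1: 6, 2: 6}
```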
Piccolo: Building Fast, Distributed Programs with Partitioned Tables
- Computer Science, OSDI
- 2010
Experiments show Piccolo to be faster than existing data flow models for many problems, while providing similar fault-tolerance guarantees and a convenient programming interface.
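A rough sketch of the partitioned-table idea named in the title, assuming a key-value table sharded across workers with a user-supplied accumulator to resolve concurrent updates to the same key. The class and method names here are hypothetical, not Piccolo's actual API:

```python
import operator

class PartitionedTable:
    """A key-value table split into shards (one per worker in the real
    system), with an accumulator that merges updates to the same key."""

    def __init__(self, num_shards, accumulate):
        self.shards = [{} for _ in range(num_shards)]
        self.accumulate = accumulate  # e.g., operator.add, min, max

    def _shard(self, key):
        # In the real system the shard index also determines which machine
        # owns the key; here it just picks a local dict.
        return self.shards[hash(key) % len(self.shards)]

    def update(self, key, value):
        shard = self._shard(key)
        shard[key] = self.accumulate(shard[key], value) if key in shard else value

    def get(self, key):
        return self._shard(key).get(key)

counts = PartitionedTable(num_shards=4, accumulate=operator.add)
counts.update("page", 1)
counts.update("page", 1)
assert counts.get("page") == 2
```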
PrIter: A Distributed Framework for Prioritizing Iterative Computations
- Computer Science, IEEE Transactions on Parallel and Distributed Systems
- 2013
This paper develops PrIter, a distributed computing framework that supports the prioritized execution of iterative computations, and shows that PrIter achieves up to 50× speedup over Hadoop for a series of iterative algorithms.
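One way to picture prioritized execution is a residual ("push") formulation of PageRank in which the vertex with the largest pending change is always processed next, so important updates propagate first. The algorithm choice, tolerance, and names below are illustrative assumptions, not PrIter's actual implementation:

```python
import heapq

def prioritized_pagerank(graph, damping=0.85, tol=1e-8):
    """graph: {vid: [out-neighbors]}. Ranks are unnormalized (they sum to
    roughly len(graph)); dangling mass is dropped for brevity."""
    rank = {v: 0.0 for v in graph}
    residual = {v: 1.0 - damping for v in graph}
    heap = [(-r, v) for v, r in residual.items()]
    heapq.heapify(heap)
    while heap:
        neg_r, v = heapq.heappop(heap)
        r = residual[v]
        if -neg_r != r or r < tol:   # stale heap entry, or below threshold
            continue
        rank[v] += r                 # absorb the pending change
        residual[v] = 0.0
        share = damping * r / max(len(graph[v]), 1)
        for u in graph[v]:           # push the change to out-neighbors
            residual[u] += share
            heapq.heappush(heap, (-residual[u], u))
    return rank

print(prioritized_pagerank({0: [1], 1: [2], 2: [0]}))
```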
Dryad: distributed data-parallel programs from sequential building blocks
- Computer Science, EuroSys '07
- 2007
The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: scheduling the use of computers and their CPUs, recovering from communication or computer failures, and transporting data between vertices.
A Common Substrate for Cluster Computing
- Computer Science, HotCloud
- 2009
Nexus fosters innovation in the cloud by letting organizations run new frameworks alongside existing ones and by letting framework developers focus on specific applications rather than building one-size-fits-all frameworks.