Petuum: A New Platform for Distributed Machine Learning on Big Data
@article{Xing2015PetuumAN,
  title={Petuum: A New Platform for Distributed Machine Learning on Big Data},
  author={E. Xing and Q. Ho and Wei Dai and J. Kim and Jinliang Wei and S. Lee and X. Zheng and Pengtao Xie and Abhimanu Kumar and Y. Yu},
  journal={IEEE Transactions on Big Data},
  year={2015},
  volume={1},
  pages={49--67}
}
What is a systematic way to efficiently apply a wide spectrum of advanced ML programs to industrial-scale problems, using Big Models (up to 100s of billions of parameters) on Big Data (up to terabytes or petabytes)? Modern parallelization strategies employ fine-grained operations and scheduling beyond the classic bulk-synchronous processing paradigm popularized by MapReduce, or even specialized graph-based execution that relies on graph representations of ML programs. The variety of approaches…
205 Citations
KunPeng: Parameter Server based Distributed Learning Systems and Its Applications in Alibaba and Ant Financial
- Computer Science
- KDD
- 2017
- 28
A Collective Communication Layer for the Software Stack of Big Data Analytics
- Computer Science
- 2016 IEEE International Conference on Cloud Engineering Workshop (IC2EW)
- 2016
- 1
Strategies and Principles of Distributed Machine Learning on Big Data
- Computer Science, Mathematics
- ArXiv
- 2015
- 67
Collaborative Filtering as a Case-Study for Model Parallelism on Bulk Synchronous Systems
- Computer Science
- CIKM
- 2017
- 2
- Highly Influenced
Litz: Elastic Framework for High-Performance Distributed Machine Learning
- Computer Science
- USENIX Annual Technical Conference
- 2018
- 29
SparCML: high-performance sparse communication for machine learning
- Computer Science, Mathematics
- SC
- 2019
- 45
BLAS-on-flash: An Efficient Alternative for Large Scale ML Training and Inference?
- Computer Science
- NSDI
- 2019
Towards MapReduce based Bayesian deep learning network for monitoring big data applications
- Computer Science
- 2017 IEEE International Conference on Big Data (Big Data)
- 2017
- 4
MapReduce Based Classification for Fault Detection in Big Data Applications
- Computer Science
- 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)
- 2017