Tupleware: Distributed Machine Learning on Small Clusters
@article{Crotty2014TuplewareDM, title={Tupleware: Distributed Machine Learning on Small Clusters}, author={Andrew Crotty and Alex Galakatos and T. Kraska}, journal={IEEE Data Eng. Bull.}, year={2014}, volume={37}, pages={63-76} }
There is a fundamental discrepancy between the targeted and actual users of current analytics frameworks. Most systems are designed for the challenges of the Googles and Facebooks of the world: petabytes of data distributed across large cloud deployments consisting of thousands of cheap commodity machines. Yet the vast majority of users operate clusters ranging from a few to a few dozen nodes, analyze relatively small datasets of up to several terabytes in size, and perform primarily compute…
17 Citations
KeystoneML: Optimizing Pipelines for Large-Scale Advanced Analytics
- Computer Science
- 2017 IEEE 33rd International Conference on Data Engineering (ICDE)
- 2017
- 100
In-Database Machine Learning: Gradient Descent and Tensor Algebra for Main Memory Database Systems
- Computer Science
- BTW
- 2019
- 3
RLEX: Safety and Data Quality in Reinforcement Learning-based and Adaptive Systems
- Computer Science
- CIDR
- 2017
ActiveClean: An Interactive Data Cleaning Framework For Modern Machine Learning
- Computer Science
- SIGMOD Conference
- 2016
- 26
Two Decades of AI4NETS - AI/ML for Data Networks: Challenges & Research Directions
- Computer Science
- NOMS 2020 - 2020 IEEE/IFIP Network Operations and Management Symposium
- 2020
References
HaLoop: Efficient Iterative Data Processing on Large Clusters
- Computer Science
- Proc. VLDB Endow.
- 2010
- 848
- Highly Influential
Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads
- Computer Science
- Proc. VLDB Endow.
- 2012
- 490
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
- Computer Science
- NSDI
- 2012
- 3,306
SystemML: Declarative machine learning on MapReduce
- Computer Science
- 2011 IEEE 27th International Conference on Data Engineering
- 2011
- 290
SCOPE: easy and efficient parallel processing of massive data sets
- Computer Science
- Proc. VLDB Endow.
- 2008
- 802