Towards Practical Serverless Analytics

@inproceedings{Pu2019TowardsPS,
  title={Towards Practical Serverless Analytics},
  author={Qifan Pu},
  year={2019}
}
Distributed computing remains inaccessible to a large number of users, in spite of many open source platforms and extensive commercial offerings. Even though many distributed computation frameworks have moved into the cloud, many users are still left to struggle with complex cluster management and configuration tools there. In this thesis, we argue that cloud stateless functions represent a viable platform for these users, eliminating cluster management overhead, fulfilling the promise of…

Key Quantitative Results

  • We show that, using fine-grained elasticity, Locus can reduce cluster time, measured in total core·seconds, by up to 59% while staying close to Spark’s query completion time or beating it by up to 2×.
  • Finally, we also show that our model is able to accurately predict shuffle performance and cost with an average error of 15.9% and 14.8%, respectively, which allows Locus to choose the most appropriate shuffle implementation and other configuration variables.
