Rafiki: Machine Learning as an Analytics Service System

@article{Wang2018RafikiML,
  title={Rafiki: Machine Learning as an Analytics Service System},
  author={Wei Wang and Sheng Wang and Jinyang Gao and Meihui Zhang and Gang Chen and Teck Khim Ng and Beng Chin Ooi},
  journal={ArXiv},
  year={2018},
  volume={abs/1804.06087}
}
Big data analytics has gained massive momentum in the last few years. Applying machine learning models to big data has become an implicit requirement or an expectation for most analysis tasks, especially in high-stakes applications. Typical applications include sentiment analysis of reviews for analyzing on-line products, image classification in food-logging applications for monitoring users' daily intake, and stock movement prediction. Extending traditional database systems to support the…

A Survey on Deep Reinforcement Learning for Data Processing and Analytics

This work provides a comprehensive review of recent works focusing on utilizing DRL to improve data processing and analytics, and presents an introduction to key concepts, theories, and methods in DRL.

End-to-end Optimization of Machine Learning Prediction Queries

Raven follows the enterprise architectural trend of collocating data and ML runtimes, and employs logical-to-physical transformations that allow operators to be executed on different runtimes (relational, ML, and DNN) and hardware (CPU, GPU).

Enabling Cost-Effective, SLO-Aware Machine Learning Inference Serving on Public Cloud

MArk (Model Ark), a general-purpose inference serving system, is proposed to tackle the dual challenge of SLO compliance and cost effectiveness, and its performance is evaluated using several state-of-the-art ML models trained in TensorFlow, MXNet, and Keras.

AutoML to Date and Beyond: Challenges and Opportunities

A new classification system for AutoML systems is introduced: a seven-tiered schematic distinguishes these systems based on their level of autonomy, and a novel level-based taxonomy defines each level according to the scope of automation support provided.

MArk: Exploiting Cloud Services for Cost-Effective, SLO-Aware Machine Learning Inference Serving

This paper tackles the dual challenge of SLO compliance and cost effectiveness with MArk (Model Ark), a general-purpose inference serving system built on Amazon Web Services (AWS), and evaluates the performance of MArk using several state-of-the-art ML models trained in popular frameworks including TensorFlow, MXNet, and Keras.

Serverless Model Serving for Data Science

It is found that serverless outperforms many cloud-based alternatives with respect to cost and performance, and can even outperform GPU-based systems for both average latency and cost.

MLIoT: An End-to-End Machine Learning System for the Internet-of-Things

This work proposes MLIoT, an end-to-end machine learning system tailored towards supporting the entire lifecycle of IoT applications, and compares it with two state-of-the-art hand-tuned systems and a commercial ML system, showing that MLIoT improves accuracy from 50% to 75% while reducing or maintaining latency.

A Case for Managed and Model-less Inference Serving

This paper defines and makes the case for managed and model-less inference serving, and identifies and discusses open research directions to realize this vision.

Distributed Deep Learning on Data Systems: A Comparative Analysis of Approaches

This paper characterizes the particular suitability of MOP for DL on data systems, but shows that, to bring MOP-based DL to DB-resident data, there is no single "best" approach and an interesting tradeoff space of approaches exists.

References

Showing 1–10 of 43 references

How good are machine learning clouds for binary classification with good features?: extended abstract

In spite of the recent advancement of machine learning research, modern machine learning systems are still far from easy to use, at least from the perspective of business users or even scientists.

Deep Learning

Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.

Clipper: A Low-Latency Online Prediction Serving System

Clipper, a general-purpose low-latency prediction serving system, is introduced; it employs a modular architecture to simplify model deployment across frameworks and applications, and improves prediction throughput, accuracy, and robustness without modifying the underlying machine learning frameworks.

Ease.ml: Towards Multi-tenant Resource Sharing for Machine Learning Workloads

A novel algorithm is developed that combines multi-armed bandits with Bayesian optimization, and a regret bound is proved under the multi-tenant setting, with the aim of minimizing the total regret of all users running automatic model selection tasks.

TFX: A TensorFlow-Based Production-Scale Machine Learning Platform

TensorFlow Extended (TFX) is presented, a TensorFlow-based general-purpose machine learning platform implemented at Google that was able to standardize the components, simplify the platform configuration, and reduce the time to production from the order of months to weeks, while providing platform stability that minimizes disruptions.

TensorFlow: A system for large-scale machine learning

The TensorFlow dataflow model is described, and the compelling performance that TensorFlow achieves for several real-world applications is demonstrated.

ATM: A distributed, collaborative, scalable system for automated machine learning

The initial results show that ATM can beat human-generated solutions for 30% of the datasets, and can do so in 1/100th of the time, demonstrating the usefulness of ATM.

SINGA: A Distributed Deep Learning Platform

SINGA, a distributed deep learning system for training big models over large datasets, supports a variety of popular deep learning models and provides different neural net partitioning schemes for training large models.

Machine Learning: The High Interest Credit Card of Technical Debt

The goal of this paper is to highlight several machine learning-specific risk factors and design patterns to be avoided or refactored where possible, including boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, changes in the external world, and a variety of system-level anti-patterns.

PANDA: Facilitating Usable AI Development

A new perspective on developing AI solutions is taken, and a solution for making AI usable is presented that will enable all subject matter experts (e.g., clinicians) to exploit AI like data scientists.