Operationalizing Machine Learning: An Interview Study

Shreya Shankar, Rolando Garcia, Joseph M. Hellerstein, Aditya G. Parameswaran
Organizations rely on machine learning engineers (MLEs) to operationalize ML, i.e., deploy and maintain ML pipelines in production. The process of operationalizing ML, or MLOps, consists of a continual loop of (i) data collection and labeling, (ii) experimentation to improve ML performance, (iii) evaluation throughout a multi-staged deployment process, and (iv) monitoring of performance drops in production. When considered together, these responsibilities seem staggering—how does anyone do… 
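The four responsibilities above form a continual loop. As a minimal sketch of that loop (all stage functions below are hypothetical placeholders, not part of any real MLOps library), one iteration might look like:

```python
# Sketch of the MLOps loop (i)-(iv) described above; every function here
# is an illustrative stand-in, not a real API.

def collect_and_label():
    # (i) gather new raw data and attach labels
    return [{"features": [0.1, 0.2], "label": 1}]

def experiment(data, model):
    # (ii) try changes (features, hyperparameters) to improve ML performance
    return {"version": model["version"] + 1, "accuracy": 0.9}

def evaluate(model, stages=("dev", "shadow", "canary", "prod")):
    # (iii) validate the candidate at each deployment stage before promotion
    return all(model["accuracy"] > 0.8 for _ in stages)

def monitor(model):
    # (iv) watch production metrics; True signals a performance drop
    return model["accuracy"] < 0.85

def mlops_iteration(model):
    """Run one pass of the (i)-(iv) loop and return the deployed model."""
    data = collect_and_label()
    candidate = experiment(data, model)
    if evaluate(candidate):
        model = candidate  # promote the candidate to production
    if monitor(model):
        pass               # a detected drop would trigger the next iteration
    return model

model = mlops_iteration({"version": 1, "accuracy": 0.8})
```

In practice each of these stages is itself a pipeline owned by MLEs, and the loop never terminates: monitoring feeds back into data collection and experimentation.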


Decentralizing Machine Learning Operations using Web3 for IoT Platforms

This work aims to decouple the Machine Learning (ML) solution, data platform, and sensors while avoiding service-level degradation, and utilizes an unlinkable end-to-end encrypted asynchronous communication protocol called Whisper, based on the Ethereum blockchain, to achieve sender and receiver anonymity.

An Efficient Framework for Monitoring Subgroup Performance of Machine Learning Systems

This paper mathematically formulates the problem of monitoring the performance of machine learning systems across all data subgroups as an optimization problem with an expensive black-box objective function, and suggests using Bayesian optimization to solve it.
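To make the underlying problem concrete, here is a naive brute-force scan over one-attribute subgroups that the black-box formulation is designed to avoid (the number of subgroups explodes combinatorially); the data layout and function names are illustrative assumptions, not the paper's method:

```python
def subgroup_accuracies(records):
    """Exhaustively compute accuracy for every (attribute, value) subgroup.

    Each record: {"attrs": {...}, "correct": bool}. This exhaustive scan is
    the naive baseline; the paper instead treats worst-subgroup search as
    expensive black-box optimization solved with Bayesian optimization.
    """
    groups = {}
    for r in records:
        for attr, val in r["attrs"].items():
            hits, total = groups.get((attr, val), (0, 0))
            groups[(attr, val)] = (hits + r["correct"], total + 1)
    return {k: hits / total for k, (hits, total) in groups.items()}

def worst_subgroup(records):
    """Return the subgroup with the lowest accuracy."""
    accs = subgroup_accuracies(records)
    return min(accs, key=accs.get)

records = [
    {"attrs": {"region": "EU", "device": "mobile"}, "correct": True},
    {"attrs": {"region": "EU", "device": "desktop"}, "correct": True},
    {"attrs": {"region": "US", "device": "mobile"}, "correct": False},
    {"attrs": {"region": "US", "device": "desktop"}, "correct": True},
]
```

Even this toy version scans every attribute of every record; with intersectional subgroups (region AND device AND more attributes), exhaustive monitoring quickly becomes intractable, which motivates the optimization framing.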

A.I. Robustness: a Human-Centered Perspective on Technological Challenges and Opportunities

This work systematically surveys recent progress to provide a reconciled terminology of concepts around AI robustness, and introduces three taxonomies to organize and describe the literature from both a fundamental and an applied point of view.

Identifying the Context Shift between Test Benchmarks and Production Data

This work introduces context shift to describe semantically meaningful changes in the underlying data generation process and identifies three methods for addressing context shift that would otherwise lead to model prediction errors.

Hidden Technical Debt in Machine Learning Systems

Real-world ML systems are found to commonly incur massive ongoing maintenance costs, and several ML-specific risk factors to account for in system design are explored.

“Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI

This paper defines, identifies, and presents empirical evidence on Data Cascades—compounding events causing negative, downstream effects from data issues—triggered by conventional AI/ML practices that undervalue data quality.

Bolt-on, Compact, and Rapid Program Slicing for Notebooks [Scalable Data Science]

Nbslicer is presented, a dynamic slicer optimized for the notebook setting whose instrumentation for resolving dynamic data dependencies is both bolt-on and switchable, allowing it to be selectively disabled to reduce instrumentation overhead.
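For intuition, backward slicing over notebook cells can be sketched as a reverse walk over each cell's read/write sets; the representation below is a simplified stand-in for the dynamic dependencies nbslicer resolves via instrumentation, not its actual design:

```python
def backward_slice(cells, target):
    """Return indices of the cells needed to reproduce `target`'s results.

    `cells` is a list of (reads, writes) variable-name sets, one per
    notebook cell in execution order -- a static stand-in for the dynamic
    data dependencies a slicer like nbslicer records at runtime.
    """
    needed = set(cells[target][0])       # variables the target cell reads
    keep = {target}
    for i in range(target - 1, -1, -1):  # walk earlier cells in reverse
        reads, writes = cells[i]
        if writes & needed:              # this cell defines something we need
            keep.add(i)
            needed = (needed - writes) | reads
    return sorted(keep)

# Cell 0: x = 1      Cell 1: y = x + 1
# Cell 2: z = 5      Cell 3: print(y)
cells = [(set(), {"x"}), ({"x"}, {"y"}), (set(), {"z"}), ({"y"}, set())]
```

Here slicing cell 3 keeps cells 0, 1, and 3 and drops the unrelated cell 2; a real notebook slicer must additionally handle aliasing, mutation, and re-executed cells, which is why dynamic instrumentation is needed.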

A Machine Learning Model Helps Process Interviewer Comments in Computer-assisted Personal Interview Instruments: A Case Study

Using over 5,000 comments from the Medical Expenditure Panel Survey, features were built that were fed to a machine learning model to predict a grouping category for each comment as previously assigned by data technicians to expedite processing.

Rethinking Streaming Machine Learning Evaluation

This work discusses how the nature of streaming ML problems introduces new real-world challenges and recommends additional metrics to assess streaming ML performance.

Machine Learning Operations (MLOps): Overview, Definition, and Architecture

This work conducts mixed-method research to furnish an aggregated overview of the necessary principles, components, and roles of Machine Learning Operations, as well as the associated architecture and workflows, and provides a definition of MLOps.

On Continuous Integration / Continuous Delivery for Automated Deployment of Machine Learning Models using MLOps

A high-level perspective on the machine learning lifecycle and the vital differences between DevOps and MLOps is given, and tools and techniques to execute the CI/CD pipeline of machine learning frameworks in the MLOps approach are discussed.

The CLEAR Benchmark: Continual LEArning on Real-World Imagery

This paper introduces CLEAR, the first continual image classification benchmark dataset with a natural temporal evolution of visual concepts in the real world that spans a decade (2004-2014), and proposes novel "streaming" protocols for CL that always test on the (near) future.
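A protocol that "always tests on the (near) future" amounts to a temporal train/test split: train on everything up to time t, evaluate on the next time bucket. A minimal sketch of such a split (the helper name and bucketing scheme are illustrative assumptions, not CLEAR's exact protocol):

```python
def streaming_splits(timestamps, n_buckets):
    """Yield (train, test) index splits where the test set is always the
    bucket immediately after the training data, so evaluation is on the
    (near) future. `timestamps` must be sorted ascending.
    """
    size = len(timestamps) // n_buckets
    buckets = [list(range(i * size, (i + 1) * size)) for i in range(n_buckets)]
    for t in range(n_buckets - 1):
        train = [i for b in buckets[: t + 1] for i in b]  # all past buckets
        test = buckets[t + 1]                             # the next bucket
        yield train, test

# e.g. ten yearly snapshots split into five temporal buckets
splits = list(streaming_splits(list(range(2004, 2014)), n_buckets=5))
```

Unlike a random i.i.d. split, every test bucket here comes strictly after its training data, which is what exposes a model to the natural temporal evolution of visual concepts the benchmark captures.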

Practices and Infrastructures for ML Systems – An Interview Study

This study investigated practices and tool-chains for ML-enabled systems from 16 organizations in various domains, using interviews to make three broad observations about data management, monitoring, and automation practices in ML model training and serving workflows.

Towards MLOps: A Framework and Maturity Model

An MLOps framework is derived that details the activities involved in the continuous development of machine learning models, and a maturity model is presented that captures the stages through which companies evolve as they become more mature and advanced.