CWD: A Machine Learning based Approach to Detect Unknown Cloud Workloads

Mohammad Hossain, Derssie Mebratu, Niranjan Hasabnis, Jun Jin, Gaurav Chaudhary, Noah Shen
Workloads in modern cloud data centers are becoming increasingly complex. The number of workloads running in cloud data centers has been growing exponentially for the last few years, and cloud service providers (CSPs) have been supporting on-demand services in real time. Realizing the growing complexity of cloud environments and cloud workloads, hardware vendors such as Intel and AMD are increasingly introducing cloud-specific workload acceleration features in their CPU platforms. These…

Workload Prediction Using ARIMA Model and Its Impact on Cloud Applications’ QoS

The realization of a cloud workload prediction module for SaaS providers based on the autoregressive integrated moving average (ARIMA) model is presented and its accuracy of future workload prediction is evaluated using real traces of requests to Web servers.

Autonomic Characterization of Workloads Using Workload Fingerprinting

This paper describes a method for developing a multivariate phase model by learning and classifying the run-time behavior of workloads, and demonstrates a workload phase forecasting method that uses phase extraction based on a combination of machine learning approaches.

Phase Annotated Learning for Apache Spark: Workload Recognition and Characterization

This paper profiles and annotates resource usage data in Spark with the application contexts in which the resources were used, and models the resource usage per context using a Mixture of Gaussians (MOG) probabilistic distribution technique.

Forecasting for Grid and Cloud Computing On-Demand Resources Based on Pattern Matching

An approach to the problem of workload prediction based on identifying similar past occurrences of the current short-term workload history is proposed, and a Cloud client resource auto-scaling algorithm is presented that uses this approach to help when scaling decisions are made.

Phase Aware Performance Modeling for Cloud Applications

This paper proposes a new methodology for performance modeling of applications deployed in the cloud, based on automatically discovered phases along with their inputs. The methodology can predict the performance of applications with up to 95% accuracy for previously unseen input configurations at less than 5% overhead.

Characterizing Computer Systems' Workloads

This paper surveys workload characterization techniques used for several types of computer systems, identifies significant issues and concerns encountered during the characterization process and proposes an augmented methodology for workload characterization as a framework.

Online Phase Detection and Characterization of Cloud Applications

This paper introduces a new methodology for automatic phase detection and characterization for applications running on the cloud that is non-intrusive, more general, lightweight and can detect phase changes online as the application runs.

Phase tracking and prediction

This paper presents a unified profiling architecture that can efficiently capture, classify, and predict phase-based program behavior on the largest of time scales, and can capture phases that account for over 80% of execution using less than 500 bytes of on-chip memory.

Analysis of Dimensionality Reduction Techniques on Big Data

Two prominent dimensionality reduction techniques, Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA), are investigated on four popular Machine Learning (ML) algorithms using the publicly available Cardiotocography dataset from the University of California, Irvine Machine Learning Repository to prove that PCA outperforms LDA in all the measures.

An Evaluation of Change Point Detection Algorithms

This study shows that binary segmentation and Bayesian online change point detection are among the best performing methods.