Agile dynamic provisioning of multi-tier Internet applications

  • Bhuvan Urgaonkar, Prashant J. Shenoy, Abhishek Chandra, Pawan Goyal, Timothy Wood
  • ACM Trans. Auton. Adapt. Syst.
Dynamic capacity provisioning is a useful technique for handling the multi-time-scale variations seen in Internet workloads. In this article, we propose a novel dynamic provisioning technique for multi-tier Internet applications that employs (1) a flexible queuing model to determine how much of the resources to allocate to each tier of the application, and (2) a combination of predictive and reactive methods that determine when to provision these resources, both at large and small time scales… 
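
The two decisions named in the abstract (how much to allocate per tier, and when) can be illustrated with a minimal sketch. It is not the article's model: it substitutes a plain M/M/1 approximation for the flexible queuing model, and the function names, the `slack` threshold, and the per-tier tuple layout are all hypothetical choices made for illustration.

```python
import math

def servers_needed(arrival_rate, service_rate, target_delay):
    """Smallest number of servers so each one meets a mean-delay target.

    Assumption: each server behaves as an M/M/1 queue, whose mean response
    time is 1 / (mu - lambda). Meeting the target therefore requires
    lambda <= mu - 1 / target_delay on every server.
    """
    per_server_capacity = service_rate - 1.0 / target_delay
    if per_server_capacity <= 0:
        raise ValueError("target delay is not achievable at this service rate")
    return max(1, math.ceil(arrival_rate / per_server_capacity))

def provision(predicted_rate, observed_rate, tiers, slack=1.2):
    """Pick a server count for every tier.

    Predictive step: size for the forecast request rate.
    Reactive step: if the observed rate exceeds the forecast by more than
    `slack`, size for the observed rate instead.
    `tiers` maps a tier name to (visit_ratio, service_rate, target_delay).
    """
    rate = predicted_rate
    if observed_rate > slack * predicted_rate:
        rate = observed_rate  # forecast was too low, so correct it
    return {
        name: servers_needed(rate * visit_ratio, mu, delay)
        for name, (visit_ratio, mu, delay) in tiers.items()
    }
```

For example, with `tiers = {"web": (1.0, 12.0, 0.5), "db": (0.5, 7.0, 0.5)}`, a forecast of 100 requests per second that matches the observed load sizes both tiers for 100 requests per second, while an observed load of 150 triggers the reactive path and sizes them for 150.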

A Heuristic Approach for Scalability of Multi-tiers Web Application in Clouds

The results indicate that the performance model faithfully captures the behavior of multi-tier applications over a wide range of workloads and configuration schemes, and show that the techniques can obtain an optimized configuration scheme with modest computation.

Dynamic Provisioning and Resource Management for Multi-Tier Cloud Based Applications

This paper dynamically increases the mean service rate of the virtual machines to avoid congestion in the multi-tier environments and obtains the system-length distributions at pre-arrival and arbitrary epochs.

Provisioning multi-tier cloud applications using statistical bounds on sojourn time

This paper models the multi-tier application as an open tandem network of M/G/1-PS queues and develops a method that produces a near-optimal application configuration to meet the percentile bound in a homogeneous server environment, that is, one using a single type of server.
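
As background for the model named above, the mean sojourn time of a tandem of M/G/1-PS queues has a simple closed form, because the mean sojourn time of an M/G/1-PS queue depends on the service-time distribution only through its mean. This covers the mean only, not the percentile bound the paper targets. The symbols are illustrative assumptions rather than the paper's notation: $\lambda$ is the request arrival rate, $E[S_i]$ the mean service time at tier $i$, $n_i$ the number of identical servers at tier $i$ (with load split evenly among them), and $K$ the number of tiers.

```latex
\rho_i = \frac{\lambda\, E[S_i]}{n_i}, \qquad
E[T_i] = \frac{E[S_i]}{1 - \rho_i}, \qquad
E[T] = \sum_{i=1}^{K} E[T_i], \qquad \rho_i < 1 \ \text{for all } i .
```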

Regression based multi-tier resource provisioning for session slowdown guarantees

Experiments using the industry standard TPC-W benchmark demonstrate the effectiveness and efficiency of the regression based dynamic resource provisioning approach in meeting the session slowdown guarantees of a multi-tier e-commerce application.

Cost-Aware Performance Modeling of Multi-tier Web Applications in the Cloud

Having such performance models enables understanding the trade-off between performance and cost, a cornerstone in developing dynamic provisioning performance management schemes, in the context of cloud computing (Amazon EC2).

Regression-based resource provisioning for session slowdown guarantee in multi-tier Internet servers

Dynamic Provisioning Modeling for Virtualized Multi-tier Applications in Cloud Data Center

A novel dynamic provisioning technique for cluster-based virtualized multi-tier applications is presented; it employs a flexible hybrid queueing model to determine the number of virtual machines at each tier.

Scalability and performance management of internet applications in the cloud

A system that finds and eliminates the VMs suffering from performance interference and adopts proactive scalability can mitigate 88% of the impact of resource provisioning overhead with only a 9% increase in cost.



An analytical model for multi-tier internet services and its applications

This paper presents a model based on a network of queues, where the queues represent different tiers of the application, sufficiently general to capture the behavior of tiers with significantly different performance characteristics and application idiosyncrasies such as session-based workloads, concurrency limits, and caching at intermediate tiers.

QoS-driven server migration for Internet data centers

This paper develops a framework for QoS-driven dynamic resource allocation in IDCs, called QuID (quality of service infrastructure on demand), and develops an optimal off-line algorithm that bounds the advantage of any dynamic policy and provides a benchmark for performance evaluation.

Dynamic resource allocation for shared data centers using online measurements

The main advantage of the techniques is that they capture the transient behavior of applications while incorporating nonlinearity in the system model, and can judiciously allocate system resources, especially under transient overload conditions.

Dynamic resource allocation for shared data centers using online measurements

A system architecture that combines online measurements with prediction and resource allocation techniques to react to changing workloads by dynamically varying the resource shares of applications and can handle nonlinearity in system behavior unlike some prior techniques.

Resource overbooking and application profiling in shared hosting platforms

By overbooking cluster resources in a controlled fashion, this platform can provide performance guarantees to applications even when overbooked, and it combines these techniques with commonly used QoS resource allocation mechanisms to provide application isolation and performance guarantees at run-time.

SEDA: an architecture for well-conditioned, scalable internet services

This work presents the SEDA design and an implementation of an Internet services platform based on this architecture, and describes several control mechanisms for automatic tuning and load conditioning, including thread pool sizing, event batching, and adaptive load shedding.

Adaptive Overload Control for Busy Internet Servers

  • M. Welsh, D. Culler
  • Computer Science, Business
    USENIX Symposium on Internet Technologies and Systems
  • 2003
This paper presents a set of techniques for managing overload in complex, dynamic Internet services based on an adaptive admission control mechanism that attempts to bound the 90th-percentile response time of requests flowing through the service.
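
One way to picture an adaptive admission controller of this kind is a feedback loop that measures the 90th-percentile response time over a window and adjusts the admitted request rate. The sketch below is only an illustration under assumed rules: the additive-increase/multiplicative-decrease policy, the function names, and every constant are hypothetical and are not taken from the paper.

```python
def percentile(samples, pct=90):
    """Nearest-rank percentile of a non-empty list (pct is an integer 1..100)."""
    ordered = sorted(samples)
    # Ceiling of pct * n / 100, computed with integer arithmetic.
    rank = -(-pct * len(ordered) // 100)
    return ordered[rank - 1]

def adjust_rate(rate, response_times, target,
                increase=2.0, decrease=0.5, min_rate=1.0, max_rate=1000.0):
    """Return the new admission rate (requests per second).

    If the measured 90th-percentile response time exceeds `target`, cut the
    rate multiplicatively; otherwise raise it additively. The result is
    clamped to [min_rate, max_rate].
    """
    if percentile(response_times, 90) > target:
        rate *= decrease
    else:
        rate += increase
    return max(min_rate, min(max_rate, rate))
```

Called once per measurement window, `adjust_rate` backs off quickly when the tail latency violates the target and probes upward slowly when it does not.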

Resource Allocation for Autonomic Data Centers using Analytic Performance Models

  • M. Bennani, D. Menascé
  • Computer Science
    Second International Conference on Autonomic Computing (ICAC'05)
  • 2005
This paper presents a solution based on the use of analytic queuing network models combined with combinatorial search techniques to dynamically redeploy servers among the various AEs in order to optimize some global utility function.

Cluster-based scalable network services

A general, layered architecture for building cluster-based scalable network services that encapsulates the above requirements for reuse, and a service-programming model based on composable workers that perform transformation, aggregation, caching, and customization (TACC) of Internet content, are proposed.

Layered queueing models for enterprise JavaBean applications

A layered queueing model for predicting the performance of distributed enterprise applications built on Enterprise JavaBeans (EJB) technology is proposed and it is shown how such models can be applied for capacity sizing of distributed Enterprise systems.