Amelie Chi Zhou

Recently, we have witnessed workflows from science and other data-intensive applications emerging on Infrastructure-as-a-Service (IaaS) clouds, and many workflow service providers offering workflow-as-a-service (WaaS). The major concern of WaaS providers is to minimize the monetary cost of executing workflows in the IaaS clouds. The selection of virtual...
Recently, performance and monetary cost optimizations for workflows from various applications in the cloud have become a hot research topic. However, we find that most existing studies adopt ad hoc optimization strategies, which fail to capture the key optimization opportunities for different workloads and cloud offerings (e.g., virtual machines with...
In this paper, we propose monetary cost optimizations for MPI-based applications with deadline constraints on Amazon EC2. In particular, we consider using two kinds of Amazon EC2 instances (on-demand and spot instances). As a spot instance can fail at any time due to out-of-bid events, fault-tolerant execution is necessary. Through detailed studies,...
Resource provisioning for scientific workflows in Infrastructure-as-a-Service (IaaS) clouds is an important and complicated problem for budget and performance optimizations of workflows. Scientists face complexities resulting from severe cloud performance dynamics and various user requirements on performance and cost. To address those complexity...
Driven by the rapidly increasing demand for handling real-time data streams, many data stream processing (DSP) systems have been proposed. Regardless of their different architectures, most of those DSP systems aim to scale out using a cluster of commodity machines and are built around a number of key design aspects: a) pipelined processing with...
I. MOTIVATION. Recently, many emerging high performance computing (HPC) and scientific computing applications have been developed and hosted in the cloud. As those applications are usually long-running jobs and are costly in the cloud, monetary cost [11], [7] and performance [3], [2] are important optimization factors. Message Passing...
Resource provisioning is an important and complicated problem for scientific workflows in Infrastructure-as-a-Service (IaaS) clouds. Scientists face complexities resulting from the diverse cloud offerings, complex workflow structures and characteristics, and various user requirements on budget and performance. In this paper, we review the...
Graph partitioning is important for optimizing the performance and communication cost of large graph processing jobs. Recently, many graph applications such as social networks store their data in geo-distributed datacenters (DCs) to provide services worldwide with low latency. This poses new challenges for existing graph partitioning methods, due to the...
Solid state drives (SSDs), or flash disks, have been considered ideal storage for various data-intensive workloads because of their low random access latency and intra-disk multi-chip parallelism. However, due to the inherent nature of flash memories, update-intensive workloads cause the flash disk to become fragmented and trigger costly internal activities such as...