Amelie Chi Zhou

Recently, performance and monetary cost optimizations for workflows from various applications in the cloud have become a hot research topic. However, we find that most existing studies adopt ad hoc optimization strategies, which fail to capture the key optimization opportunities for different workloads and cloud offerings (e.g., virtual machines with …
Recently, we have witnessed workflows from science and other data-intensive applications emerging on Infrastructure-as-a-Service (IaaS) clouds, and many workflow service providers offering workflow-as-a-service (WaaS). The major concern of WaaS providers is to minimize the monetary cost of executing workflows in the IaaS clouds. The selection of virtual …
In this paper, we propose monetary cost optimizations for MPI-based applications with deadline constraints on Amazon EC2. In particular, we consider using two kinds of Amazon EC2 instances (on-demand and spot instances). As a spot instance can fail at any time due to out-of-bid events, fault-tolerant execution is necessary. Through detailed studies, …
Resource provisioning for scientific workflows in Infrastructure-as-a-Service (IaaS) clouds is an important and complicated problem for budget and performance optimizations of workflows. Scientists face complexities resulting from severe cloud performance dynamics and various user requirements on performance and cost. To address those complexity …
I. MOTIVATION Recently, we have witnessed many emerging high-performance computing (HPC) and scientific computing applications being developed and hosted in the cloud. As those applications are usually long-running jobs and are costly in the cloud, monetary cost [11], [7] and performance [3], [2] are important optimization factors. Message Passing …
Resource provisioning is an important and complicated problem for scientific workflows in Infrastructure-as-a-Service (IaaS) clouds. Scientists face complexities resulting from the diverse cloud offerings, complex workflow structures and characteristics, and various user requirements on budget and performance. In this paper, we review the …
Graph partitioning is important for optimizing the performance and communication cost of large graph processing jobs. Recently, many graph applications such as social networks store their data in geo-distributed datacenters (DCs) to provide services worldwide with low latency. This poses new challenges for existing graph partitioning methods, due to the …
Driven by the rapidly increasing demand for handling real-time data streams, many data stream processing (DSP) systems have been proposed. Despite their different architectures, these DSP systems mostly aim to scale out using a cluster of commodity machines and are built around a number of key design aspects: a) pipelined processing with …
Cloud computing has recently evolved into a popular computing infrastructure for many applications. Scientific computing, which was mainly hosted in private clusters and grids, has started to migrate development and deployment to the public cloud environment. eScience as a service is an emerging and promising direction for scientific computing. We review …