Exploiting Workload Characteristics and Service Diversity to Improve the Availability of Cloud Storage Systems

Abstract

With the increasing utilization and popularity of cloud infrastructure, more and more data are being moved to cloud storage systems. This makes the availability of cloud storage services critically important, particularly since outages of cloud storage services do occur from time to time. Depending solely on a single cloud storage provider therefore risks violating the service-level agreement (SLA) because of weakened service availability. This has led to the notion of Cloud-of-Clouds, in which data redundancy is introduced by distributing data among multiple independent cloud storage providers. The key to the effectiveness of Cloud-of-Clouds approaches lies in how the data redundancy is incorporated and distributed among the clouds. However, existing Cloud-of-Clouds approaches use either replication or erasure codes to distribute data redundantly across multiple clouds, thus incurring either high space overhead or high performance overhead. In this paper, we propose a hybrid redundant data distribution approach, called HyRD, to improve cloud storage availability in Cloud-of-Clouds by exploiting the workload characteristics and the diversity of cloud providers. In HyRD, large files are distributed across multiple cost-efficient cloud storage providers with erasure-coded data redundancy, while small files and file system metadata are replicated on multiple high-performance cloud storage providers. Experiments conducted on our lightweight prototype implementation of HyRD show that HyRD improves cost efficiency by 33.4 and 20.4 percent, and reduces access latency by 58.7 and 34.8 percent, compared with the DuraCloud and RACS schemes, respectively.
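The placement rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the size cutoff `SMALL_FILE_THRESHOLD`, the `Provider` and `Placement` types, the `place` function, and the `replicas` and `stripe_width` parameters are all hypothetical names and values chosen for illustration, since the abstract does not specify them. The sketch only selects a redundancy scheme and target providers; it does not perform any encoding or upload.

```python
from dataclasses import dataclass

# Hypothetical size cutoff; the abstract does not state HyRD's actual threshold.
SMALL_FILE_THRESHOLD = 1 * 1024 * 1024  # 1 MiB, an assumption


@dataclass(frozen=True)
class Provider:
    name: str
    cost_per_gb: float  # illustrative storage price
    latency_ms: float   # illustrative access latency


@dataclass(frozen=True)
class Placement:
    scheme: str       # "replication" or "erasure_coding"
    providers: tuple  # names of the chosen providers


def place(size_bytes, is_metadata, providers, replicas=2, stripe_width=4):
    """Choose a redundancy scheme and target providers for one object."""
    if is_metadata or size_bytes < SMALL_FILE_THRESHOLD:
        # Small files and metadata: replicate on the lowest-latency providers.
        ranked = sorted(providers, key=lambda p: p.latency_ms)
        chosen = ranked[:replicas]
        scheme = "replication"
    else:
        # Large files: erasure-code across the cheapest providers.
        ranked = sorted(providers, key=lambda p: p.cost_per_gb)
        chosen = ranked[:stripe_width]
        scheme = "erasure_coding"
    return Placement(scheme, tuple(p.name for p in chosen))
```

Under these assumptions, a 4 KiB file or a metadata object would be replicated on the fastest providers, while a 50 MiB file would be striped with erasure coding across the cheapest ones. How HyRD actually ranks providers and sets the cutoff is described in the paper body, not in the abstract.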

DOI: 10.1109/TPDS.2015.2475273


Cite this paper

@article{Mao2016ExploitingWC,
  title={Exploiting Workload Characteristics and Service Diversity to Improve the Availability of Cloud Storage Systems},
  author={Bo Mao and Suzhen Wu and Hong Jiang},
  journal={IEEE Transactions on Parallel and Distributed Systems},
  year={2016},
  volume={27},
  pages={2010-2021}
}