An Efficient Intermediate Data Fault-Tolerance Approach in the Cloud

Abstract

Cloud computing frameworks have recently gained popularity for processing large-scale parallel data applications. These applications usually generate enormous amounts of intermediate data, which are short-lived yet essential to the completion of the job. When servers fail, the intermediate data they hold become unavailable, which in turn affects the computation of the whole job. However, existing fault-tolerant processing approaches adopt only simple replication strategies, which can incur significant network overhead and do not consider the characteristics of the intermediate data. In this paper, we therefore propose an efficient cloud computing framework that supports intermediate data fault tolerance, named the IDF_Support framework. By classifying computing tasks into different categories, the IDF_Support framework can handle intermediate data failures effectively. We then propose intermediate data fault-tolerant algorithms at two levels: the inner task intermediate data fault-tolerant algorithm (Inner task IDF), which handles fault tolerance within a task, and the outer task intermediate data fault-tolerant algorithm (Outer task IDF), which handles fault tolerance among tasks. The experimental results show that our algorithms maintain the reliability of the system when server failures occur.

DOI: 10.1109/WISA.2014.44

Cite this paper

@article{Song2014AnEI,
  title={An Efficient Intermediate Data Fault-Tolerance Approach in the Cloud},
  author={Baoyan Song and Cai Ren and Xuecheng Li and Linlin Ding},
  journal={2014 11th Web Information System and Application Conference},
  year={2014},
  pages={203-206}
}