A reliability task scheduling algorithm with optimizing makespan in heterogeneous systems

Abstract

Fault tolerance and makespan (schedule length) are important requirements in many distributed heterogeneous systems. In this paper we propose a fault-tolerant scheduling heuristic for precedence-constrained tasks, based on a primary-backup replication scheme. We take a bi-criteria approach that minimizes the makespan while also accounting for the failure probability of the application. The user can choose a trade-off between reliability maximization and makespan minimization. Major achievements include low complexity and a reduction in the number of additional communications induced by the replication and clustering mechanism. Simulation results show that, compared with existing scheduling algorithms in the literature, our scheduling algorithm improves both reliability and performance.
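
The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a constant-rate (exponential) failure model, independent failures of the primary and backup copies, and a simple weighted-sum score. The names `failure_probability`, `replicated_failure`, `bicriteria_score`, `pick`, the weight `alpha`, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def failure_probability(rate, duration):
    """Probability that a processor with a constant failure rate
    fails at some point during `duration` (assumed exponential model)."""
    return 1.0 - math.exp(-rate * duration)


def replicated_failure(primary, backup):
    """A task with a primary and a backup copy is lost only if both
    copies fail (assumption: the two failures are independent).
    Each argument is a (failure_rate, execution_time) pair."""
    return failure_probability(*primary) * failure_probability(*backup)


def bicriteria_score(makespan, failure_prob, alpha, makespan_bound):
    """Weighted sum of normalized makespan and failure probability.
    alpha = 1 considers only makespan, alpha = 0 only reliability.
    Lower scores are better."""
    return alpha * (makespan / makespan_bound) + (1.0 - alpha) * failure_prob


# Two hypothetical candidate schedules as (makespan, failure probability).
fast = (100.0, replicated_failure((1e-3, 40.0), (2e-3, 50.0)))
safe = (130.0, replicated_failure((1e-4, 60.0), (1e-4, 70.0)))


def pick(alpha):
    """Return the name of the candidate with the lowest score for `alpha`."""
    candidates = {"fast": fast, "safe": safe}
    bound = max(m for m, _ in candidates.values())
    return min(
        candidates,
        key=lambda name: bicriteria_score(*candidates[name], alpha, bound),
    )
```

With these illustrative numbers, a user who weights makespan heavily (for example `pick(0.9)`) gets the shorter schedule, while a user who cares only about reliability (`pick(0.0)`) gets the schedule placed on the more reliable processors.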

Cite this paper

@article{WeiPeng2012ART, title={A reliability task scheduling algorithm with optimizing makespan in heterogeneous systems}, author={Jing Wei-Peng and Wu Zhi-bo and Liu Hong-wei and Dong Jian}, journal={World Automation Congress 2012}, year={2012}, pages={409-413} }