Efficient Process Replication for MPI Applications: Sharing Work between Replicas

Abstract

With the increased failure rate expected in future extreme-scale supercomputers, process replication might become a viable alternative to checkpointing. By default, the workload efficiency of replication is limited to 50% because of the additional resources that have to be used to execute the replicas of the application's processes. In this paper, we…
DOI: 10.1109/IPDPS.2015.29

4 Figures and Tables