Proactive Fault Tolerance in MPI Applications Via Task Migration

Failures are likely to be more frequent in systems with thousands of processors. Therefore, schemes for dealing with faults become increasingly important. In this paper, we present a fault tolerance solution for parallel applications that proactively migrates execution from processors where failure is imminent. Our approach assumes that some failures are…




Semantic Scholar estimates that this publication has 88 citations based on the available data.
