Performance Implications of Failures on MapReduce Applications
Parallel computing is seeing increasing use in critical applications. The need therefore arises to test the robustness of parallel applications in the presence of exceptional conditions, or faults. Communication-software-based fault injection is an extremely flexible approach to robustness testing in message-passing parallel computers. A fault injection methodology and a tool that use this approach are presented. The tool, known as FIMD-MPI, allows faults to be injected into MPI-based applications. The structure and operation of FIMD-MPI are described, and the use of the tool is illustrated on an example fault-tolerant MPI application.