MPIEcho: A Framework for Transparent MPI Task Replication

Abstract

In this paper we describe MPIEcho, a profiling-layer library for replicating MPI ranks independently of application parallelism. This replication breaks the tight coupling between an application’s view of the parallel topology and the topology provided by the underlying MPI implementation, enabling a variety of use cases such as fault detection…
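To make the decoupling concrete, the following is a minimal sketch of one way a replication layer could map physical ranks to the logical ranks an application sees. The layout (equal-sized replica sets, modulo mapping) and the names `echo_identity`, `echo_logical_size`, and `echo_identify` are assumptions for illustration and are not taken from MPIEcho. In a profiling-layer library, a wrapper for `MPI_Comm_rank` could call `PMPI_Comm_rank` to obtain the physical rank and then return the logical rank computed this way.

```c
#include <assert.h>

/* Hypothetical layout, not taken from MPIEcho: the physical world of
   `world_size` ranks is split into `degree` equal replica sets. */
typedef struct {
    int logical_rank;  /* rank the application would see */
    int replica_id;    /* which copy of that logical rank this process is */
} echo_identity;

/* Number of ranks the application would see. */
static int echo_logical_size(int world_size, int degree)
{
    assert(degree > 0 && world_size > 0 && world_size % degree == 0);
    return world_size / degree;
}

/* Map a physical rank to its logical rank and replica index. */
static echo_identity echo_identify(int physical_rank, int world_size, int degree)
{
    int n = echo_logical_size(world_size, degree);
    assert(physical_rank >= 0 && physical_rank < world_size);
    echo_identity id = { physical_rank % n, physical_rank / n };
    return id;
}
```

Under this assumed layout, a job launched on 8 processes with a replication degree of 2 would present 4 ranks to the application, with physical ranks 1 and 5 both acting as logical rank 1.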
