Total order broadcast for fault tolerant exascale systems

Abstract

In the process of designing a new fault tolerant run-time for future exascale systems, we discovered that a total order broadcast would be necessary. That is, nodes of a supercomputer should be able to broadcast messages to other nodes even in the face of failures. All messages should be seen in the same order at all nodes. While this is a well studied… (More)

Topics

3 Figures and Tables