Exascale supercomputers will gather hundreds of millions of cores. The first problem we address is resilience and fault tolerance, so that applications can run to completion on such platforms. The second problem is energy consumption, since such systems will consume enormous amounts of energy. In this paper, we evaluate checkpointing and existing fault tolerance…
Future supercomputers will consume enormous amounts of energy. These very large-scale systems will gather many homogeneous clusters. In this paper, we analyze the power consumption of nodes from different homogeneous clusters under different workloads. As expected, we observe that these nodes exhibit the same level of performance. But we also show that…
Energy consumption and fault tolerance are two interrelated issues that must be addressed when designing future exascale systems. Fault tolerance protocols used for checkpointing consume different amounts of energy depending on parameters such as application features, the number of processes in the execution, and platform characteristics. Currently, the only way to select a…
Exascale supercomputers will gather hundreds of millions of cores. The main problem to address when running applications on such platforms is energy consumption: it is a major limitation, given that the currently fastest supercomputer consumes more than 12 MW for a maximum performance of 10 PFlops. Besides, we also need to overcome important…
As they will gather hundreds of millions of cores, future exascale supercomputers will consume enormous amounts of energy. Besides being very large, their power consumption will be dynamic and irregular. Thus, in order to consume energy efficiently, powering such systems will require a permanent negotiation between the energy supplier and one of its major…
Checkpointing protocols consume different amounts of energy depending on parameters such as application features and platform characteristics. To select a protocol for a given execution, we propose an energy estimator that relies on an energy calibration of the considered platform and a user description of the execution settings.
Future supercomputers will gather hundreds of millions of communicating cores. Moving data in such systems will consume a great deal of energy. In this paper, we address the energy consumption of data broadcasting in such large-scale systems. To this end, we propose a framework to estimate the energy consumed by different MPI broadcasting…