In this paper, we describe our experiment in developing an implementation of the Linpack benchmark for TianHe-1, a petascale CPU/GPU supercomputer system and the largest GPU-accelerated system ever attempted. An adaptive optimization framework is presented to balance the workload distribution across the GPUs and CPUs with negligible runtime overhead, …
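The abstract does not say how the adaptive framework chooses the CPU/GPU split, so the following is only a minimal sketch of one common approach: re-estimate each device's throughput from the previous step's timings and set the GPU share so that both devices would finish at the same time. The names `update_gpu_fraction` and `simulate`, and the device speeds used in the simulation, are hypothetical and are not taken from the paper.

```python
def update_gpu_fraction(gpu_fraction, gpu_time, cpu_time):
    """Return a new GPU work share from the last step's measured timings.

    Assumes 0 < gpu_fraction < 1 and positive timings (illustrative only).
    """
    if not 0.0 < gpu_fraction < 1.0:
        raise ValueError("gpu_fraction must lie strictly between 0 and 1")
    if gpu_time <= 0.0 or cpu_time <= 0.0:
        raise ValueError("timings must be positive")
    # Observed throughput: share of the total work completed per second.
    gpu_rate = gpu_fraction / gpu_time
    cpu_rate = (1.0 - gpu_fraction) / cpu_time
    # Splitting work in proportion to throughput equalizes finish times.
    return gpu_rate / (gpu_rate + cpu_rate)


def simulate(gpu_speed=8.0, cpu_speed=2.0, steps=5, start=0.5):
    """Run the update rule against fixed, made-up device speeds."""
    fraction = start
    history = [fraction]
    for _ in range(steps):
        # Stand-ins for timings that a real run would measure.
        gpu_time = fraction / gpu_speed
        cpu_time = (1.0 - fraction) / cpu_speed
        fraction = update_gpu_fraction(fraction, gpu_time, cpu_time)
        history.append(fraction)
    return history


if __name__ == "__main__":
    print(simulate())
```

With constant device speeds the rule settles after one update. On real hardware the timings vary from step to step, which is why the share would be recomputed continuously rather than fixed once.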
This paper proposes a method for optimizing the data saved during application-level checkpointing of MPI programs, based on live-variable analysis. We present the implementation of a source-to-source precompiler (CAC) that automates application-level checkpointing using this optimization. The experiment shows that CAC is capable of …
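To illustrate why live-variable analysis reduces checkpoint size, here is a minimal sketch for straight-line code only. It is not CAC's implementation. Each statement is modeled as a hypothetical pair `(defs, uses)` of variable-name sets, and `live_at` is an illustrative helper. A variable needs saving only if it is live at the checkpoint, that is, read later before being overwritten.

```python
def live_at(statements, checkpoint_index):
    """Variables live just before statements[checkpoint_index].

    Applies the standard backward rule live_in = (live_out - defs) | uses
    to straight-line code (no branches or loops).
    """
    live = set()
    for defs, uses in reversed(statements[checkpoint_index:]):
        live -= set(defs)   # a definition kills the earlier value
        live |= set(uses)   # a use makes the variable live
    return live


# Hypothetical program; the checkpoint sits before index 3.
program = [
    ({"a"}, set()),            # a = init()
    ({"tmp"}, {"a"}),          # tmp = f(a)
    ({"b"}, {"tmp"}),          # b = g(tmp)
    ({"tmp"}, {"b"}),          # tmp = h(b)    <- checkpoint before this line
    ({"out"}, {"tmp", "a"}),   # out = tmp + a
]

if __name__ == "__main__":
    print(sorted(live_at(program, 3)))
```

In this example `tmp` holds a value at the checkpoint but is overwritten before its next use, so it can be left out of the saved state. Only `a` and `b` need to be written.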