Analyzing Checkpointing Trends for Applications on the IBM Blue Gene/P System

Authors: Harish Gapanati Naik, Rinku Gupta, Peter H. Beckman
Venue: 2009 International Conference on Parallel Processing Workshops
Current petascale systems have tens of thousands of hardware components and complex system software stacks, which increase the probability of faults occurring during the lifetime of a process. Checkpointing has been a popular method of providing fault tolerance in high-end systems. While considerable research has been done to optimize checkpointing, in practice the method still involves a high-cost overhead for users. In this paper, we study the checkpointing overhead seen by applications…
