Towards a more fault resilient multigrid solver

Abstract

The effectiveness of sparse linear solvers is typically studied in terms of their convergence properties and computational complexity, while their ability to handle transient hardware errors, such as bit-flips that lead to silent data corruption (SDC), has received less attention. As supercomputers continue to add more cores to increase performance, they…

Cite this paper

@inproceedings{Calhoun2015TowardsAM,
  title={Towards a more fault resilient multigrid solver},
  author={Jon Calhoun and Luke N. Olson and Marc Snir and William Gropp},
  booktitle={SpringSim},
  year={2015}
}