Algorithmic Cholesky factorization fault recovery

Authors: Douglas Hakkarinen and Zizhong Chen
Venue: 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)
Modeling and analysis of large-scale scientific systems often use linear least squares regression, frequently employing Cholesky factorization to solve the resulting set of linear equations. With large matrices, the factorization is often performed on high-performance clusters containing many processors. Assuming a constant failure rate per processor, the probability of a failure occurring during execution increases linearly with the number of processors. Fault-tolerant methods attempt to reduce the…
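As a rough illustration of the computation the abstract refers to, the following sketch solves a small least squares problem through the normal equations X^T X b = X^T y using a Cholesky factorization. It is a sequential, pure-Python toy with no fault tolerance and is not the paper's algorithm; the function names `cholesky` and `solve_least_squares` are hypothetical and chosen only for this example.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: a[j][j] minus the contribution of earlier columns.
        d = a[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(d)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = s / L[j][j]
    return L


def solve_least_squares(x_rows, y):
    """Minimize ||X b - y||_2 via the normal equations (assumes full column rank)."""
    m, n = len(x_rows), len(x_rows[0])
    # Form the Gram matrix X^T X and the right-hand side X^T y.
    gram = [[sum(x_rows[r][i] * x_rows[r][j] for r in range(m)) for j in range(n)]
            for i in range(n)]
    rhs = [sum(x_rows[r][i] * y[r] for r in range(m)) for i in range(n)]
    L = cholesky(gram)
    # Forward substitution: solve L z = rhs.
    z = [0.0] * n
    for i in range(n):
        z[i] = (rhs[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    # Back substitution: solve L^T b = z.
    b = [0.0] * n
    for i in reversed(range(n)):
        b[i] = (z[i] - sum(L[k][i] * b[k] for k in range(i + 1, n))) / L[i][i]
    return b


if __name__ == "__main__":
    # Fit y = b0 + b1 * x to four points lying on the line y = 1 + 2x.
    rows = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
    print(solve_least_squares(rows, [1.0, 3.0, 5.0, 7.0]))
```

On a cluster the factorization step would be distributed across processors, which is where the failure exposure described in the abstract arises; this sketch only shows the arithmetic being protected.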
This paper has 33 citations.