We present a new fault-tolerant algorithm for computing the connected components of a graph. Our algorithm derives from a highly parallel but non-resilient algorithm based on the technique of label propagation (LP). To make the LP algorithm resilient to transient soft faults, we apply an algorithmic design principle that we…
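For context, the label propagation technique named in this abstract can be sketched as below. This is a minimal sequential sketch of the plain, non-resilient baseline only, not the fault-tolerant algorithm the paper presents. The function name `connected_components_lp` and the edge-list input format are assumptions made for illustration.

```python
def connected_components_lp(num_vertices, edges):
    """Label each vertex with the smallest vertex id in its component.

    Hypothetical helper for illustration; assumes vertices are numbered
    0..num_vertices-1 and edges is a list of undirected (u, v) pairs.
    """
    # Build an undirected adjacency list.
    adj = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Every vertex starts with its own id as its label.
    labels = list(range(num_vertices))
    changed = True
    while changed:
        changed = False
        # Each round reads the previous labels and writes new ones,
        # mimicking one synchronous parallel sweep.
        new_labels = labels[:]
        for v in range(num_vertices):
            best = min([labels[v]] + [labels[u] for u in adj[v]])
            if best < new_labels[v]:
                new_labels[v] = best
                changed = True
        labels = new_labels
    return labels


# Two components: {0, 1, 2} and {3, 4}.
print(connected_components_lp(5, [(0, 1), (1, 2), (3, 4)]))  # [0, 0, 0, 3, 3]
```

Vertices that end with the same label belong to the same component. The loop stops once a full sweep changes no label, which is why a single corrupted label (a soft fault) can silently yield a wrong answer in this baseline.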
This paper presents the first sparse direct solver for distributed memory systems comprising hybrid multicore CPUs and Intel Xeon Phi co-processors. It builds on the algorithmic approach of SUPERLU_DIST, which is right-looking and statically pivoted. Our contribution is a novel algorithm called HALO. The name is shorthand for highly asynchronous lazy…