We show how to use the idea of *self-stabilization*, which originates in the context of distributed control, to build fault-tolerant iterative solvers. Generally, a self-stabilizing system is one that, starting from an arbitrary state (valid or invalid), reaches a valid state within a finite number of steps. This property imbues the system with a…
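The self-stabilization property described above can be illustrated with a toy example. The sketch below is not the method from the paper; it assumes a small, strictly diagonally dominant system (the matrix `A`, right-hand side `b`, and the helper names `jacobi_step`, `residual_norm`, and `solve_with_fault` are all hypothetical) and uses plain Jacobi iteration, which converges from any starting vector for such systems. A simulated transient fault overwrites one entry of the iterate, and the iteration still reaches a valid answer because the corrupted state is just another starting point.

```python
def jacobi_step(A, b, x):
    """One Jacobi sweep: solve each row for its diagonal unknown."""
    n = len(b)
    return [
        (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
        for i in range(n)
    ]


def residual_norm(A, b, x):
    """Max-norm of the residual A x - b."""
    n = len(b)
    return max(
        abs(sum(A[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n)
    )


def solve_with_fault(A, b, fault_at=5, max_steps=200, tol=1e-10):
    """Run Jacobi iteration and inject one simulated soft fault.

    At step `fault_at`, the first entry of the iterate is overwritten
    with a wildly wrong value (an assumption standing in for a bit flip).
    """
    x = [0.0] * len(b)
    for step in range(max_steps):
        if step == fault_at:
            x[0] = 1e6  # simulated transient soft fault
        x = jacobi_step(A, b, x)
        if residual_norm(A, b, x) < tol:
            return x, step + 1
    return x, max_steps


# Hypothetical strictly diagonally dominant system; exact solution is [1, 1, 1].
A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
b = [5.0, 6.0, 5.0]
x, steps = solve_with_fault(A, b)
```

The fault costs extra iterations but not correctness here. Solvers without this kind of built-in contraction would need an explicit correction step to recover a valid state.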

This paper presents the first hybrid MPI+OpenMP+CUDA implementation of a distributed memory right-looking unsymmetric sparse direct solver (i.e., sparse LU factorization) that uses static pivoting. While BLAS calls can account for more than 40% of the overall factorization time, the difficulty is that small problem sizes dominate the workload, making…

We present a new fault-tolerant algorithm for the problem of computing the connected components of a graph. Our algorithm derives from a highly parallel but non-resilient algorithm, which is based on the technique of *label propagation* (LP). To make the LP algorithm resilient to transient soft faults, we apply an algorithmic design principle that we…
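For readers unfamiliar with label propagation, the following is a minimal sequential sketch of the baseline, non-resilient idea only. It is not the fault-tolerant algorithm from the paper. The function name `connected_components_lp` and the edge-list input format are assumptions made for illustration. Each vertex starts with its own index as a label, and labels are repeatedly lowered to the minimum across each edge until nothing changes.

```python
def connected_components_lp(num_vertices, edges):
    """Label each vertex with the smallest vertex index in its component.

    `edges` is a list of (u, v) pairs for an undirected graph. This is the
    plain label-propagation baseline with no protection against faults.
    """
    labels = list(range(num_vertices))  # every vertex starts as its own label
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            smallest = min(labels[u], labels[v])
            if labels[u] != smallest:
                labels[u] = smallest
                changed = True
            if labels[v] != smallest:
                labels[v] = smallest
                changed = True
    return labels


# Two components {0, 1, 2} and {3, 4}, plus the isolated vertex 5.
labels = connected_components_lp(6, [(0, 1), (1, 2), (3, 4)])
print(labels)  # [0, 0, 0, 3, 3, 5]
```

Because labels only ever decrease, a corrupted label that is too small would never be repaired by this loop, which is one way to see why the baseline is not resilient to soft faults.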

This paper presents the first sparse direct solver for distributed memory systems comprising hybrid multicore CPU and Intel Xeon Phi co-processors. It builds on the algorithmic approach of SUPERLU_DIST, which is right-looking and statically pivoted. Our contribution is a novel algorithm called HALO. The name is shorthand for highly asynchronous lazy…
