A Distributed CPU-GPU Sparse Direct Solver

Piyush Sao, Richard W. Vuduc, and Xiaoye S. Li
This paper presents the first hybrid MPI+OpenMP+CUDA implementation of a distributed memory right-looking unsymmetric sparse direct solver (i.e., sparse LU factorization) that uses static pivoting. While BLAS calls can account for more than 40% of the overall factorization time, the difficulty is that small problem sizes dominate the workload, making efficient GPU utilization challenging. This fact motivates our approach, which is to find ways to aggregate collections of small BLAS operations…
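To make the terms in the abstract concrete, the following is a minimal dense sketch of a right-looking LU factorization with static pivoting. It is not the paper's implementation: the function name `lu_right_looking_static`, the pure-Python dense storage, and the perturbation threshold (square root of machine epsilon times the largest absolute entry) are all assumptions chosen for illustration. In a blocked sparse solver the trailing update in the inner loops becomes many small matrix-matrix products, which is where the BLAS calls mentioned above come from.

```python
import math


def lu_right_looking_static(A):
    """Right-looking LU without row exchanges (illustrative sketch).

    Returns (L, U, replaced), where `replaced` counts the tiny pivots
    that were perturbed instead of being swapped out.
    Assumes A is a square, nonzero matrix given as a list of lists.
    """
    n = len(A)
    W = [row[:] for row in A]  # working copy, overwritten with L and U

    # Assumed threshold: sqrt(double-precision eps) * max |a_ij|.
    eps = 2.0 ** -52
    norm_a = max(abs(x) for row in W for x in row)
    threshold = math.sqrt(eps) * norm_a

    replaced = 0
    for k in range(n):
        # Static pivoting: keep the pivot order fixed and perturb a
        # tiny diagonal entry rather than exchanging rows.
        if abs(W[k][k]) < threshold:
            W[k][k] = threshold if W[k][k] >= 0 else -threshold
            replaced += 1
        pivot = W[k][k]

        # Scale column k below the diagonal to form the L multipliers.
        for i in range(k + 1, n):
            W[i][k] /= pivot

        # Right-looking step: update the whole trailing submatrix now.
        for i in range(k + 1, n):
            lik = W[i][k]
            if lik != 0.0:
                for j in range(k + 1, n):
                    W[i][j] -= lik * W[k][j]

    L = [[W[i][j] if j < i else (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    U = [[W[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return L, U, replaced
```

Because the pivot order never changes during the factorization, the sparsity structure and communication pattern can be fixed in advance, which is the usual reason static pivoting is preferred in distributed-memory solvers. The perturbed pivots make the factors slightly inexact, so a solver of this kind typically follows the triangular solves with a few steps of iterative refinement.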
This paper has 23 citations.




