Irregular applications are hard to optimize for communication because their irregular data accesses are difficult to analyze accurately and efficiently. The difficulty is especially acute when translating irregular shared-memory applications to message-passing form for clusters. Without effective irregular data analysis, the translation system generates unnecessary or redundant communication, which limits application scalability. In this paper, we present a Lean Distributed Shared Memory…
Figure 5: Nonzero distribution of the sparse input matrices in SPMUL, EQUAKE, and CG. Nonzero elements make up less than 1% of each of the three input matrices. In SPMUL and EQUAKE, the output of the sparse matrix-vector multiplication on each processor is also sparse; CG, however, produces a dense output because the nonzeros in its input matrix are randomly distributed.
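
The effect described in the caption can be illustrated with a small sketch. It is not the paper's analysis or its benchmark data. It assumes a hypothetical partitioning in which each processor owns a contiguous block of matrix columns and writes a partial result for every output row in which it holds a nonzero. The matrix size, processor count, and the two synthetic patterns (`banded` for clustered nonzeros, `scattered` for randomly placed nonzeros) are illustrative choices, not values from the source.

```python
import random


def touched_output_rows(nonzeros, n_cols, n_procs):
    """Return, for each processor, the set of output rows it writes.

    Assumption (not from the source): columns are split into contiguous
    blocks, one per processor, and a processor contributes to output row r
    whenever it owns a nonzero in row r.
    """
    block = (n_cols + n_procs - 1) // n_procs  # ceiling division
    rows = [set() for _ in range(n_procs)]
    for r, c in nonzeros:
        rows[c // block].add(r)
    return rows


def banded(n, half_width):
    """Clustered pattern: nonzeros only near the diagonal."""
    return [(r, c)
            for r in range(n)
            for c in range(max(0, r - half_width), min(n, r + half_width + 1))]


def scattered(n, per_row, seed=0):
    """Random pattern: per_row nonzeros at random columns in every row."""
    rng = random.Random(seed)
    return [(r, c) for r in range(n) for c in rng.sample(range(n), per_row)]


def mean_output_density(nonzeros, n, n_procs):
    """Average fraction of the n output rows that a processor writes."""
    rows = touched_output_rows(nonzeros, n, n_procs)
    return sum(len(s) for s in rows) / (n_procs * n)


if __name__ == "__main__":
    n, n_procs = 2000, 8
    for name, nz in (("banded", banded(n, 8)), ("scattered", scattered(n, 16))):
        print(name,
              "matrix density:", len(nz) / (n * n),
              "per-processor output density:", mean_output_density(nz, n, n_procs))
```

Both synthetic matrices have fewer than 1% nonzeros, yet under this assumed partitioning the clustered pattern leaves each processor writing only a small share of the output rows, while the random pattern makes each processor write most of them. That is one plausible way to read why a sparse input can still produce a dense per-processor output.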