# Reducing Communication Costs for Sparse Matrix Multiplication within Algebraic Multigrid

@article{Ballard2016ReducingCC, title={Reducing Communication Costs for Sparse Matrix Multiplication within Algebraic Multigrid}, author={Grey Ballard and Christopher M. Siefert and Jonathan J. Hu}, journal={SIAM J. Sci. Comput.}, year={2016}, volume={38} }

We consider the sequence of sparse matrix-matrix multiplications performed during the setup phase of algebraic multigrid. In particular, we show that the most commonly used parallel algorithm is often not the most communication-efficient one for all of the matrix-matrix multiplications involved. By using an alternative algorithm, we show that the communication costs are reduced (in theory and practice), and we demonstrate the performance benefit for both model (structured) and more realistic…
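As a rough illustration of the kind of products the abstract refers to, the sketch below forms a Galerkin coarse operator $A_c = P^T (A P)$ with a serial row-wise sparse product. It assumes that the setup-phase sequence includes this triple product, which is standard in algebraic multigrid. The dict-of-dicts storage, the 1D Laplacian test matrix, the piecewise-constant prolongator, and all function names are hypothetical choices made for illustration; this is not the authors' parallel algorithm or any library's API.

```python
from collections import defaultdict


def spgemm(A, B):
    """Row-wise sparse product C = A * B for matrices stored as {row: {col: value}}."""
    C = {}
    for i, row in A.items():
        acc = defaultdict(float)  # sparse accumulator for row i of C
        for k, a_ik in row.items():
            for j, b_kj in B.get(k, {}).items():
                acc[j] += a_ik * b_kj
        if acc:
            C[i] = dict(acc)
    return C


def transpose(A):
    """Transpose of a {row: {col: value}} sparse matrix."""
    T = defaultdict(dict)
    for i, row in A.items():
        for j, v in row.items():
            T[j][i] = v
    return dict(T)


def laplacian_1d(n):
    """Tridiagonal [-1, 2, -1] matrix, used here only as a small model problem."""
    A = {}
    for i in range(n):
        A[i] = {i: 2.0}
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    return A


def aggregation_prolongator(n, agg_size):
    """Piecewise-constant prolongator: fine node i belongs to aggregate i // agg_size."""
    return {i: {i // agg_size: 1.0} for i in range(n)}


A = laplacian_1d(6)
P = aggregation_prolongator(6, 3)
AP = spgemm(A, P)                    # first product: A * P
A_coarse = spgemm(transpose(P), AP)  # second product: P^T * (A * P)
print(A_coarse)
```

In a distributed setting each of these two products moves different data between processes, which is why the choice of parallel algorithm can matter separately for each one.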

## 16 Citations

Parallel memory-efficient all-at-once algorithms for the sparse matrix triple products in multigrid methods

- Computer Science
- ArXiv
- 2019

Two new algorithms are proposed that construct a coarse matrix in a single pass through the input matrices, without any auxiliary matrices, in order to save memory; they are reported to be perfectly scalable in both compute time and memory usage.

Hypergraph Partitioning for Parallel Sparse Matrix-Matrix Multiplication

- Computer Science
- SPAA
- 2015

This paper characterizes the communication cost of a sparse matrix-matrix multiplication algorithm in terms of the size of a cut of an associated hypergraph that encodes the computation for a given input nonzero structure.

Hypergraph Partitioning for Sparse Matrix-Matrix Multiplication

- Computer Science
- TOPC
- 2016

It is shown that identifying a communication-optimal algorithm for particular input matrices is equivalent to solving a hypergraph partitioning problem, and that hypergraphs are an accurate model for reasoning about the communication costs of SpGEMM as well as a practical tool for exploring the SpGEMM algorithm design space.

αSetup-AMG: an adaptive-setup-based parallel AMG solver for sequence of sparse linear systems

- Computer Science
- CCF Trans. High Perform. Comput.
- 2020

The main idea behind αSetup-AMG is to introduce a setup condition into the coarsening process so that the setup is constructed as needed, instead of being constructed in advance in an independent phase as in the traditional procedure.

TileSpGEMM: a tiled algorithm for parallel sparse general matrix-matrix multiplication on GPUs

- Computer Science
- PPoPP
- 2022

This paper proposes a tiled parallel SpGEMM algorithm that sparsifies the tiled method used in dense general matrix-matrix multiplication and saves each non-empty tile in sparse form; it outperforms four state-of-the-art SpGEMM methods.

A Systematic Survey of General Sparse Matrix-Matrix Multiplication

- Computer Science
- ArXiv
- 2020

An experimental comparative study of existing CPU and GPU implementations is presented as part of a survey of SpGEMM optimization from 1977 to 2019, which also highlights future research directions and how future studies can leverage the findings to encourage better design and implementation.

A Parallel Implementation of a Two-Level Overlapping Schwarz Method with Energy-Minimizing Coarse Space Based on Trilinos

- Computer Science
- SIAM J. Sci. Comput.
- 2016

We describe a new implementation of a two-level overlapping Schwarz preconditioner with energy-minimizing coarse space (GDSW: generalized Dryja–Smith–Widlund) and show numerical results for an…

Technical Note: Improving the computational efficiency of sparse matrix multiplication in linear atmospheric inverse problems

- Computer Science, Mathematics
- 2016

A hybrid-parallel sparse-sparse matrix multiplication approach that is more efficient by a third in terms of execution time and operation count relative to standard sparse matrix multiplication algorithms available in most libraries is presented.

Performance of fully-coupled algebraic multigrid preconditioners for large-scale VMS resistive MHD

- Computer Science
- J. Comput. Appl. Math.
- 2018

High-Performance Sparse Matrix-Matrix Products on Intel KNL and Multicore Architectures

- Computer Science
- ICPP Workshops
- 2018

A critical finding is that hash-table-based SpGEMM gets a significant performance boost if the nonzeros are not required to be sorted within each row of the output matrix.

## References

Parallel Sparse Matrix-Matrix Multiplication and Indexing: Implementation and Experiments

- Computer Science
- SIAM J. Sci. Comput.
- 2012

It is demonstrated that the parallel SpGEMM methods, which use two-dimensional block data distributions with serial hypersparse kernels, are indeed highly flexible, scalable, and memory-efficient in the general case.

Communication optimal parallel multiplication of sparse random matrices

- Computer Science
- SPAA
- 2013

Two new parallel algorithms are obtained and it is proved that they match the expected communication cost lower bound, and hence they are optimal.

Hypergraph Partitioning for Parallel Sparse Matrix-Matrix Multiplication

- Computer Science
- SPAA
- 2015

This paper characterizes the communication cost of a sparse matrix-matrix multiplication algorithm in terms of the size of a cut of an associated hypergraph that encodes the computation for a given input nonzero structure.

Parallel Smoothed Aggregation Multigrid: Aggregation Strategies on Massively Parallel Machines

- Computer Science
- ACM/IEEE SC 2000 Conference (SC'00)
- 2000

This paper considers parallelization of the smoothed aggregation multigrid methods, discusses three different parallel aggregation algorithms, and illustrates the advantages and disadvantages of each variant in terms of parallelism and convergence.

A general parallel sparse-blocked matrix multiply for linear scaling SCF theory

- Computer Science
- 2000

ML 5.0 Smoothed Aggregation User's Guide

- Computer Science
- 2006

This document describes one specific algebraic multigrid approach: smoothed aggregation, a multilevel and domain decomposition method for symmetric and nonsymmetric systems of equations (like elliptic equations, or compressible and incompressible fluid dynamics problems).

Sparse matrix multiplication: The distributed block-compressed sparse row library

- Computer Science
- Parallel Comput.
- 2014

Sparse Matrix-Matrix Products Executed Through Coloring

- Computer Science
- SIAM J. Matrix Anal. Appl.
- 2015

This paper proposes a new algorithm for computing sparse matrix-matrix products by exploiting their nonzero structure through the process of graph coloring and proves its viability for examples including multigrid methods used to solve boundary value problems as well as matrix products appearing in unstructured applications.

Optimizing Sparse Matrix—Matrix Multiplication for the GPU

- Computer Science
- ACM Trans. Math. Softw.
- 2015

The implementation is fully general and the optimization strategy adaptively processes the SpGEMM workload row-wise to substantially improve performance by decreasing the work complexity and utilizing the memory hierarchy more effectively.

Simultaneous Input and Output Matrix Partitioning for Outer-Product-Parallel Sparse Matrix-Matrix Multiplication

- Computer Science
- SIAM J. Sci. Comput.
- 2014

Three hypergraph models are proposed that achieve simultaneous partitioning of input and output matrices without any replication of input data for outer-product-parallel sparse matrix-matrix multiplication (SpGEMM).