Matrix Multiplication on Hypercubes Using Full Bandwidth and Constant Storage

Abstract

For matrix multiplication on hypercube multiprocessors with the product matrix accumulated in place, a processor must receive about $P^2/\sqrt{N}$ elements of each input operand, with operands of size $P \times P$ distributed evenly over $N$ processors. With concurrent communication on all ports, the number of element transfers in sequence can be reduced to $P^2/(\sqrt{N}\log N)$…
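As a rough illustration of the two counts quoted in the abstract, the following sketch evaluates them for a sample problem size. The function names are hypothetical, and the logarithm is assumed to be base 2 (the number of ports on a hypercube of N nodes); the abstract itself does not state the base.

```python
import math


def elements_received(P: int, N: int) -> float:
    """Approximate number of elements of one input operand that a
    processor must receive: P^2 / sqrt(N)."""
    return P * P / math.sqrt(N)


def sequential_transfers_all_ports(P: int, N: int) -> float:
    """Approximate number of element transfers in sequence when all
    ports communicate concurrently: P^2 / (sqrt(N) * log N).
    Assumption: log is base 2, i.e. the hypercube dimension."""
    return P * P / (math.sqrt(N) * math.log2(N))


if __name__ == "__main__":
    P, N = 1024, 64  # illustrative values, not taken from the paper
    print(elements_received(P, N))
    print(sequential_transfers_all_ports(P, N))
```

For these sample values the second count is smaller than the first by a factor of log N = 6, which is the gain the abstract attributes to using all ports at once.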

Cite this paper

@inproceedings{Ho1991MatrixMO, title={Matrix Multiplication on Hypercubes Using Full Bandwidth and Constant Storage}, author={Ching-Tien Ho and S. Lennart Johnsson and Alan Edelman}, year={1991} }