Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms

Ramesh C. Agarwal, Fred G. Gustavson, Mohammad Zubair
IBM Journal of Research and Development
We describe the algorithms and architecture approach to produce high-performance codes for numerically intensive computations. In this approach, for a given computation, we design algorithms so that they perform optimally when run on a target machine, in this case the new POWER2™ machines from the RS/6000 family of RISC processors. The algorithmic features that we emphasize are functional parallelism, cache/register blocking, algorithmic prefetching, loop unrolling, and algorithmic…
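
Two of the features named in the abstract, register blocking and loop unrolling, can be illustrated with a small matrix-multiply sketch in C. This is not code from the paper: the function names `matmul_naive` and `matmul_blocked2x2`, the 2x2 block size, and the restriction to even `n` are assumptions made only for illustration. The blocked version keeps four accumulators in local variables, so each loaded element of A and B is reused twice, and the four multiply-adds in the inner loop are independent of one another, which is the kind of work a processor with more than one floating-point unit may be able to overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Reference version: C += A * B for n x n row-major matrices. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = c[i * n + j];
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Illustrative 2x2 register-blocked version (hypothetical name).
 * Assumption for brevity: n is even, so no cleanup loops are needed. */
void matmul_blocked2x2(size_t n, const double *a, const double *b, double *c)
{
    assert(n % 2 == 0);
    for (size_t i = 0; i < n; i += 2) {
        for (size_t j = 0; j < n; j += 2) {
            /* Four accumulators that the compiler can keep in registers. */
            double c00 = c[i * n + j];
            double c01 = c[i * n + j + 1];
            double c10 = c[(i + 1) * n + j];
            double c11 = c[(i + 1) * n + j + 1];

            for (size_t k = 0; k < n; k++) {
                /* Four loads feed four multiply-adds. */
                double a0 = a[i * n + k];
                double a1 = a[(i + 1) * n + k];
                double b0 = b[k * n + j];
                double b1 = b[k * n + j + 1];

                /* Independent updates: no accumulator depends on another. */
                c00 += a0 * b0;
                c01 += a0 * b1;
                c10 += a1 * b0;
                c11 += a1 * b1;
            }

            c[i * n + j] = c00;
            c[i * n + j + 1] = c01;
            c[(i + 1) * n + j] = c10;
            c[(i + 1) * n + j + 1] = c11;
        }
    }
}
```

In this sketch the naive inner loop issues two loads per multiply-add, while the blocked inner loop issues four loads for four multiply-adds. Larger blocks raise that ratio further, limited by the number of available registers; the block size that is actually best depends on the target machine.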


Publications citing this paper.

Automatic generation of fast BLAS3-GEMM: A portable compiler approach

2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) • 2017

Anatomy of High-Performance Many-Threaded Matrix Multiplication

2014 IEEE 28th International Parallel and Distributed Processing Symposium • 2014
