A Proposal for a Set of Parallel Basic Linear Algebra Subprograms

@inproceedings{Choi1995APF,
  title={A Proposal for a Set of Parallel Basic Linear Algebra Subprograms},
  author={Jaeyoung Choi and Jack J. Dongarra and Susan Ostrouchov and Antoine Petitet and David W. Walker and R. Clinton Whaley},
  booktitle={PARA},
  year={1995}
}
This paper describes a proposal for a set of Parallel Basic Linear Algebra Subprograms (PBLAS) for distributed memory MIMD computers. The PBLAS are targeted at distributed vector-vector, matrix-vector and matrix-matrix operations with the aim of simplifying the parallelization of linear algebra codes, especially when implemented on top of the sequential BLAS.
ScaLAPACK: A Linear Algebra Library for Message-Passing Computers
TLDR
The content and performance of some of the ScaLAPACK software are outlined and initial performance results for selected PBLAS routines and a subset of ScaLAPACK driver routines are provided.
A Parallel Implementation of a Potential Reduction Algorithm for Box-Constrained Quadratic Programming
TLDR
A parallel version of the potential reduction algorithm for MIMD distributed memory machines is described, in which the computational kernels arising at each step of the algorithm are concurrently performed by using standard parallel software environments.
Parallel implementation of BLAS: general techniques for Level 3 BLAS
TLDR
It is shown that the techniques used for the matrix-matrix multiplication naturally extend to all important level 3 BLAS and thus this approach becomes an enabling technology for efficient parallel implementation of these routines and libraries that use BLAS.
A Proposal for Parallel Sparse BLAS
In this paper we propose a set of parallel interfaces that extends the sparse BLAS presented in [8] to distributed memory parallel machines with message passing programming interfaces. Our main
A Portable Implementation of ARPACK for Distributed Memory
TLDR
A parallel implementation of ARPACK is presented which is portable across a wide range of distributed memory platforms and requires minimal changes to the serial code.
P_ARPACK: An Efficient Portable Large Scale Eigenvalue Package for Distributed Memory Parallel Architectures
TLDR
P_ARPACK is a parallel version of the ARPACK software which is portable across a wide range of distributed memory platforms and requires minimal changes to the serial code.
Solving PDE Problems on Parallel and Distributed Computer Systems Using the NAG Parallel Library
TLDR
The NAG Parallel Library provides parallel subroutines in some of the areas covered by traditional numerical libraries, such as dense and sparse linear algebra, optimization, quadrature and random number generation, as well as utility routines for data distribution, input/output and process management.
ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance
TLDR
The content and performance of ScaLAPACK, a collection of mathematical software for linear algebra computations on distributed memory computers, are outlined and alternative approaches to mathematical libraries are suggested, explaining how ScaLAPACK could be integrated into efficient and user-friendly distributed systems.
A Parallel Numerical Library for UPC
TLDR
The design and implementation of a parallel numerical library for UPC built on top of the sequential BLAS routines is described, taking into account data locality in order to guarantee a good performance.

References

SHOWING 1-10 OF 38 REFERENCES
A set of level 3 basic linear algebra subprograms
This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extensions are targeted at matrix-vector operations that should provide for efficient and portable
A proposal for standard linear algebra subprograms
A set of FORTRAN callable subprograms are proposed which will be useful in the development of efficient portable ANSI FORTRAN subprograms and applications programs in the area of linear algebra.
The Multicomputer Toolbox Approach to Concurrent Blas
TLDR
Algorithms for dense matrix-vector and matrix-matrix multiplication on general P × Q logical process grids are presented, and experiments run demonstrating their performance characteristics.
Basic Linear Algebra Subprograms for Fortran Usage
TLDR
A package of 38 low level subprograms for many of the basic operations of numerical linear algebra is presented, intended to be used with FORTRAN.
Algorithm 656: an extended set of basic linear algebra subprograms: model implementation and test programs
This paper describes a model implementation and test software for the Level 2 Basic Linear Algebra Subprograms (Level 2 BLAS). Level 2 BLAS are targeted at matrix-vector operations with the aim of
ScaLAPACK: a scalable linear algebra library for distributed memory concurrent computers
The authors describe ScaLAPACK, a distributed memory version of the LAPACK software package for dense and banded matrix computations. Key design features are the use of distributed versions of the
ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance
TLDR
The content and performance of ScaLAPACK, a collection of mathematical software for linear algebra computations on distributed memory computers, are outlined and alternative approaches to mathematical libraries are suggested, explaining how ScaLAPACK could be integrated into efficient and user-friendly distributed systems.
The Multicomputer Toolbox approach to concurrent BLAS and LACS
TLDR
There is limited leverage in LACS per se as a stand-alone message-passing standard, and it is proposed that needed capabilities instead be integrated in a general, application-level message passing standard, focusing attention on CBLAS and large-scale application needs.
LAPACK Working Note 73: Basic Linear Algebra Communication Subprograms: Analysis and Implementation Across Multiple Parallel Architectures
TLDR
The BLACS are used as the communication layer for the ScaLAPACK project, which involves implementing the LAPACK library on distributed memory MIMD machines.
LAPACK Working Note 94: A User's Guide to the BLACS v1.0
The BLACS (Basic Linear Algebra Communication Subprograms) project is an ongoing investigation whose purpose is to create a linear algebra oriented message passing interface that is implemented e