A set of level 3 basic linear algebra subprograms

@article{Dongarra1990ASO,
  title={A set of level 3 basic linear algebra subprograms},
  author={Jack J. Dongarra and Jeremy Du Croz and Sven Hammarling and Iain S. Duff},
  journal={ACM Trans. Math. Softw.},
  year={1990},
  volume={16},
  pages={1--17}
}
This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extensions are targeted at matrix-matrix operations that should provide for efficient and portable implementations of algorithms for high-performance computers.
A Proposal for a Set of Parallel Basic Linear Algebra Subprograms
The PBLAS are targeted at distributed vector-vector, matrix-vector and matrix-matrix operations with the aim of simplifying the parallelization of linear algebra codes, especially when implemented on top of the sequential BLAS.
An overview of the sparse basic linear algebra subprograms
We discuss the interface design for the Sparse Basic Linear Algebra Subprograms (BLAS), the kernels in the recent standard from the BLAS Technical Forum that are concerned with unstructured sparse ...
Multiplication of Matrices of Arbitrary Shape on a Data Parallel Computer
Some level-2 and level-3 Distributed Basic Linear Algebra Subroutines (DBLAS) that have been implemented on the Connection Machine system CM-200 are described. No assumption is made on the shape or ...
Algorithm 679: A set of level 3 basic linear algebra subprograms: model implementation and test programs
A model implementation and test software for the Level 3 Basic Linear Algebra Subprograms, targeted at matrix-matrix operations, provide a portable set of FORTRAN 77 Level 3 BLAS for machines where specialized implementations do not exist or are not required.
A compiler for linear algebra operations
A compiler that translates arithmetic expressions containing matrices to efficient sequences of calls to basic linear algebra kernels is presented.
Basic Linear Operators in Fortran 90
Standard F90 lets a programmer define operators, corresponding to mathematical operations, that apply to arrays or more complicated classes of objects: The matrix-vector and matrix-matrix products ...
The Multicomputer Toolbox Approach to Concurrent Blas
Concurrent Basic Linear Algebra Subprograms (CBLAS) are a sensible approach to extending the successful Basic Linear Algebra Subprograms (BLAS) to multicomputers. We describe many of the issues ...
Linear Algebra Subprograms on Shared Memory Computers
New linear algebra routines for LU, Cholesky and QR factorizations and for the solution of dense symmetric eigenproblems, aimed at shared-memory, cache-based systems, are introduced.
A Proposed API for Batched Basic Linear Algebra Subprograms
This paper proposes an API for Batched Basic Linear Algebra Subprograms (Batched BLAS), with the aim of providing more efficient, but portable, implementations of algorithms on high-performance manycore architectures.
LAPACK: A Linear Algebra Library for High-Performance Computers
This talk outlines the computational package called LAPACK. LAPACK is a collection of Fortran 77 subroutines for the analysis and solution of various systems of simultaneous linear algebraic ...

References

SHOWING 1-10 OF 53 REFERENCES
An extended set of FORTRAN basic linear algebra subprograms
This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extensions are targeted at matrix-vector operations that should provide for efficient and portable ...
Sparse extensions to the FORTRAN Basic Linear Algebra Subprograms
This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extension is targeted at sparse vector operations, with the goal of providing efficient, but portable, ...
A proposal for a set of level 3 basic linear algebra subprograms
The Level 3 BLAS are targeted at matrix-matrix operations with the aim of providing more efficient, but portable, implementations of algorithms on high-performance computers, especially those with hierarchical memory and parallel processing capability.
Issues relating to extension of the Basic Linear Algebra Subprograms
The authors' views are based on their implementation of the BLAS in highly efficient assembly language for the Cray-1 and Cray X-MP, and use of the BLAS and some of the proposed extensions in a variety of application areas.
Algorithm 656: an extended set of basic linear algebra subprograms: model implementation and test programs
This paper describes a model implementation and test software for the Level 2 Basic Linear Algebra Subprograms (Level 2 BLAS). Level 2 BLAS are targeted at matrix-vector operations with the aim of ...
Algorithm 679: A set of level 3 basic linear algebra subprograms: model implementation and test programs
A model implementation and test software for the Level 3 Basic Linear Algebra Subprograms, targeted at matrix-matrix operations, provide a portable set of FORTRAN 77 Level 3 BLAS for machines where specialized implementations do not exist or are not required.
Basic Linear Algebra Subprograms for Fortran Usage
A package of 38 low level subprograms for many of the basic operations of numerical linear algebra is presented, intended to be used with FORTRAN.
The use of BLAS3 in linear algebra on a parallel processor with a hierarchical memory
This note describes work at CSRD which shows that a third level of the BLAS (BLAS3) is needed to achieve high-performance on multivector processors with a shared hierarchical memory.
Implementing Linear Algebra Algorithms for Dense Matrices on a Vector Pipeline Machine
This paper examines common implementations of linear algebra algorithms, such as matrix-vector multiplication, matrix-matrix multiplication and the solution of linear equations. The different ...
Linear algebra programs for use on a vector computer with a secondary solid state storage device
Performance data are presented which demonstrate that the speed of the routines approaches that of routines with all data in main memory and is close to the maximum speed of the processor.