A set of level 3 basic linear algebra subprograms
This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extensions are targeted at matrix-vector operations that should provide for efficient and portable ...
Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation
Open MPI combines novel features previously unavailable in an open-source, production-quality implementation of MPI. It provides a stable platform for third-party research and enables the run-time composition of independent software add-ons.
PVM: Parallel virtual machine: a users' guide and tutorial for networked parallel computing
Covers the PVM system: heterogeneous network computing, trends in distributed computing, a PVM overview, other packages, and troubleshooting (getting PVM installed, getting PVM running, compiling applications, running applications, debugging and tracing, and debugging the system).
LAPACK Users' Guide, 3rd ed.
Automated empirical optimizations of software and the ATLAS project
This paper describes the ATLAS (Automatically Tuned Linear Algebra Software) project and the fundamental principles that underlie it, with the present emphasis on the Basic Linear Algebra Subprograms (BLAS), a widely used, performance-critical linear algebra kernel library.
Automatically Tuned Linear Algebra Software
An approach for the automatic generation and optimization of numerical software for processors with deep memory hierarchies and pipelined functional units, using the widely used linear algebra kernels called the Basic Linear Algebra Subprograms (BLAS).
Templates for the Solution of Algebraic Eigenvalue Problems
This book discusses iterative projection methods for solving eigenproblems; some of the techniques used to solve these problems come from the literature on Hermitian eigenvalue problems.
Performance of various computers using standard linear equations software
This report compares the performance of different computer systems in solving dense systems of linear equations, ranging from a CRAY Y-MP to scientific workstations such as the Apollo and Sun to IBM PCs.
A Portable Programming Interface for Performance Evaluation on Modern Processors
The purpose of the PAPI project is to specify a standard application programming interface (API) for accessing hardware performance counters available on most modern microprocessors. These counters ...