In this paper we report on the development of an efficient and portable implementation of Strassen's matrix multiplication algorithm for matrices of arbitrary size. Our technique for defining the…

In this paper, we show how the intimate relationship between a certain matrix algebra and the ring of complex polynomials suggests a strategy for devising new parallel algorithms for the eigenproblem…

This paper compares two general library routines for performing parallel distributed matrix multiplication. The PUMMA algorithm utilizes block scattered data layout, whereas BiMMeR utilizes virtual…

In this paper we report on the development of an efficient and portable implementation of Strassen's matrix multiplication algorithm for matrices of arbitrary size. Our implementation is designed to be…

Matrix multiplication is a key primitive in block matrix algorithms such as those found in LAPACK. We present results from our study of matrix multiplication algorithms on the Intel Touchstone Delta,…

We present an overview of the banded Invariant Subspace Decomposition Algorithm for symmetric matrices and describe a parallel implementation of this algorithm. The algorithm described here is a…

In this paper, preliminary research results on a new algorithm for finding all the eigenvalues and eigenvectors of a real diagonalizable matrix with real eigenvalues are presented. The basic…

We give an overview of the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA) by first describing the algorithm, followed by a discussion of a parallel implementation of…

In this paper, we discuss the performance achieved by several implementations of the recently defined Message Passing Interface (MPI) standard. In particular, performance results for different…