Parallel Triangular Sylvester-Type Matrix Equation Solvers for SMP Systems Using Recursive Blocking

@inproceedings{Jonsson2000ParallelTS,
  title={Parallel Triangular Sylvester-Type Matrix Equation Solvers for SMP Systems Using Recursive Blocking},
  author={Isak Jonsson and Bo K{\aa}gstr{\"o}m},
  booktitle={PARA},
  year={2000}
}
We present recursive blocked algorithms for solving triangular Sylvester-type matrix equations. Recursion leads to automatic blocking that is variable and "squarish". The main part of the computations is performed as level 3 general matrix multiply and add (GEMM) operations. We also present new, highly optimized superscalar kernels for solving small matrix equations stored in level 1 cache. Thereby, a larger part of the total execution time is spent in GEMM operations. In turn, this…
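As a concrete illustration of the recursive splitting described in the abstract, here is a minimal NumPy/SciPy sketch of a solver for the triangular Sylvester equation AX + XB = C (sign convention mine) with A and B upper triangular; the cutoff BLK and the SciPy routine used for the base case are my choices, not the paper's optimized superscalar kernels.

```python
import numpy as np
from scipy.linalg import solve_sylvester

BLK = 64  # assumed cutoff where a small, cache-resident kernel would take over

def rtrsyl(A, B, C):
    """Solve A X + X B = C for X, with A (m x m) and B (n x n) upper triangular."""
    m, n = C.shape
    if max(m, n) <= BLK:                      # base case: small dense kernel
        return solve_sylvester(A, B, C)
    if m >= n:                                # split A and the rows of X, C
        k = m // 2
        A11, A12, A22 = A[:k, :k], A[:k, k:], A[k:, k:]
        X2 = rtrsyl(A22, B, C[k:, :])         # bottom row block first
        C1 = C[:k, :] - A12 @ X2              # GEMM update
        X1 = rtrsyl(A11, B, C1)
        return np.vstack([X1, X2])
    else:                                     # split B and the columns of X, C
        k = n // 2
        B11, B12, B22 = B[:k, :k], B[:k, k:], B[k:, k:]
        X1 = rtrsyl(A, B11, C[:, :k])         # left column block first
        C2 = C[:, k:] - X1 @ B12              # GEMM update
        X2 = rtrsyl(A, B22, C2)
        return np.hstack([X1, X2])
```

Halving the larger of m and n at each level is what yields the automatic, variable, "squarish" blocking, and all off-diagonal work in the sketch is done by GEMM.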

Recursive blocked algorithms for solving triangular systems—Part I: one-sided and coupled Sylvester-type matrix equations

Novel recursive blocked algorithms for solving one-sided triangular matrix equations, including the continuous-time Sylvester and Lyapunov equations and a generalized coupled Sylvester equation, are presented.
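In their standard forms, the one-sided equations referred to are (notation mine, with the coefficient matrices assumed (quasi-)triangular after a Schur-type reduction):

```latex
\begin{align*}
  \text{continuous-time Sylvester:} \quad & AX - XB = C,\\
  \text{continuous-time Lyapunov:}  \quad & AX + XA^{T} = C,\\
  \text{generalized coupled Sylvester:} \quad & AR - LB = C, \quad DR - LE = F .
\end{align*}
```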

Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software

Some of the recent advances made by applying the paradigm of recursion to dense matrix computations on today's memory-tiered computer systems are reviewed and detailed.

A Faster and Simpler Recursive Algorithm for the LAPACK Routine DGELS

The four different problems solved by DGELS are essentially reduced to two by explicit transposition of A, and by avoiding redundant computations in the update of B the authors reduce the work needed to compute the minimum-norm solution.

A NEW MUCH FASTER AND SIMPLER ALGORITHM FOR LAPACK

The four different problems of DGELS are essentially reduced to two by explicit transposition of A, and by avoiding redundant computations in the update of B the authors reduce the work needed to compute the minimum-norm solution.
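For reference, the four problem classes handled by DGELS that the two summaries above refer to are, in the usual LAPACK convention (my paraphrase, assuming a full-rank m x n matrix A):

```latex
\begin{align*}
  \texttt{TRANS='N'},\; m \ge n:&\quad \min_x \|b - Ax\|_2,\\
  \texttt{TRANS='N'},\; m < n:&\quad \min \|x\|_2 \ \text{subject to}\ Ax = b,\\
  \texttt{TRANS='T'},\; m \ge n:&\quad \min \|x\|_2 \ \text{subject to}\ A^{T}x = b,\\
  \texttt{TRANS='T'},\; m < n:&\quad \min_x \|b - A^{T}x\|_2.
\end{align*}
```

Explicitly forming the transpose of A maps the two TRANS='T' cases onto the corresponding TRANS='N' cases, which is the reduction from four problems to two.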

Recursive Blocked Algorithms, Data Structures, and High-Performance Software for Solving Linear Systems and Matrix Equations

This thesis deals with the development of efficient and reliable algorithms and library software for factorizing matrices and solving matrix equations on high-performance computer systems. The arch…

References

New Serial and Parallel Recursive QR Factorization Algorithms for SMP Systems

A hybrid recursive algorithm that outperforms the LAPACK algorithm DGEQRF by 78% to 21% as m = n increases from 100 to 1000, and an automatic variable blocking that allows a level 2 part of a standard block algorithm to be replaced by level 3 operations, are presented.

Parallel Algorithms for Triangular Sylvester Equations: Design, Scheduling and Scalability Issues

A new scalable algorithm for solving (quasi-)triangular Sylvester equations on a logical 2D-toroidal processor grid is presented, and a static scheduler that chooses the optimal processor grid and block sizes for a rectangular block scatter (RBS) mapping of the matrices is incorporated in the algorithm.

Applying recursion to serial and parallel QR factorization leads to better performance

A hybrid recursive algorithm that outperforms the LAPACK algorithm DGEQRF by about 20% for large square matrices and up to almost a factor of 3 for tall thin matrices is introduced.

LAPACK-style algorithms and software for solving the generalized Sylvester equation and estimating the separation between regular matrix pairs

This work reorganizes the standard algorithm for this problem to use Level 3 BLAS operations, like matrix multiplication, in its inner loop, and develops and compares several condition estimation algorithms, which inexpensively but accurately estimate the sensitivity of the solution of this linear system.
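The generalized Sylvester equation in question, and the linear system whose sensitivity the estimators target, have the following standard form (notation mine; this is the formulation behind LAPACK's DTGSYL, with R and L the m x n unknowns):

```latex
\begin{gather*}
  AR - LB = C, \qquad DR - LE = F,\\[2pt]
  Zx = b,\qquad
  Z = \begin{bmatrix} I_n \otimes A & -B^{T} \otimes I_m\\
                      I_n \otimes D & -E^{T} \otimes I_m \end{bmatrix},\qquad
  x = \begin{bmatrix} \operatorname{vec}(R)\\ \operatorname{vec}(L)\end{bmatrix},\qquad
  b = \begin{bmatrix} \operatorname{vec}(C)\\ \operatorname{vec}(F)\end{bmatrix}.
\end{gather*}
```

The separation Dif[(A,D),(B,E)] of the two regular pairs is the smallest singular value of Z, which the condition estimators approximate without forming Z explicitly.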

Distributed and Shared Memory Block Algorithms for the Triangular Sylvester Equation with sep^{-1} Estimators

Coarse-grain message passing and shared memory algorithms for solving the quasi-triangular Sylvester equation are discussed, and estimators based on the Frobenius norm and the 1-norm, respectively, are presented.

Recursive Blocked Data Formats and BLAS's for Dense Linear Algebra Algorithms

It is argued that, out of the NB! possible orderings of the NB blocks, the recursive ordering is one of the best, because the authors' algorithms are also recursive and do their computations on submatrices that follow the new recursive data structure definition.
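A minimal sketch of what such a recursive block ordering can look like (my own illustration of the idea, not the paper's exact format definition): the grid of blocks is split in half along its longer dimension at each level, and blocks are stored in the order the recursion visits them, so the submatrices touched by a recursive algorithm stay contiguous.

```python
def recursive_block_order(rows, cols):
    """Return the (i, j) coordinates of a rows x cols block grid in
    recursive (divide-in-half) storage order."""
    def visit(r0, r1, c0, c1, out):
        if r1 - r0 == 1 and c1 - c0 == 1:
            out.append((r0, c0))              # a single block: emit it
            return
        if r1 - r0 >= c1 - c0:                # split the longer dimension
            rm = (r0 + r1) // 2
            visit(r0, rm, c0, c1, out)
            visit(rm, r1, c0, c1, out)
        else:
            cm = (c0 + c1) // 2
            visit(r0, r1, c0, cm, out)
            visit(r0, r1, cm, c1, out)
    order = []
    visit(0, rows, 0, cols, order)
    return order

# e.g. recursive_block_order(2, 2) -> [(0, 0), (0, 1), (1, 0), (1, 1)]
```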

Recursion leads to automatic variable blocking for dense linear-algebra algorithms

An analysis of the recursive algorithm in terms of both FLOP count and storage usage is presented, and it is demonstrated that the recursive version outperforms the level-3 LAPACK versions of DGETRF and DPOTRF on an RS/6000 workstation.
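To make the automatic variable blocking concrete, here is a minimal recursive Cholesky sketch in the spirit of the recursive DPOTRF discussed there (the cutoff and the NumPy/SciPy routines used for the base case and the updates are my choices, not the cited implementation):

```python
import numpy as np
from scipy.linalg import solve_triangular

def rpotrf(A, cutoff=32):
    """Return lower-triangular L with A = L @ L.T, for symmetric positive definite A."""
    n = A.shape[0]
    if n <= cutoff:                                  # base case: unblocked kernel
        return np.linalg.cholesky(A)
    k = n // 2                                       # automatic "block size"
    L11 = rpotrf(A[:k, :k], cutoff)
    # L21 = A21 * L11^{-T}: solve L11 * L21^T = A21^T (a TRSM)
    L21 = solve_triangular(L11, A[k:, :k].T, lower=True).T
    S = A[k:, k:] - L21 @ L21.T                      # SYRK/GEMM-style update
    L22 = rpotrf(S, cutoff)
    L = np.zeros_like(A, dtype=float)
    L[:k, :k], L[k:, :k], L[k:, k:] = L11, L21, L22
    return L
```

The split point n // 2 plays the role of the block size, so no blocking parameter needs tuning, and the bulk of the flops lands in the matrix-multiply update.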

GEMM-based level 3 BLAS: high-performance model implementations and performance evaluation benchmark

This work states that it is possible to develop a portable and high-performance level 3 BLAS library mainly relying on a highly optimized GEMM, the routine for the general matrix multiply and add operation.
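As an illustration of the GEMM-based approach (my own sketch, not the benchmark's model implementation), here is a level 3 triangular solve organized so that nearly all floating-point work is done by GEMM updates, with only small triangular solves on the diagonal blocks:

```python
import numpy as np
from scipy.linalg import solve_triangular

def gemm_based_trsm(L, B, nb=64):
    """Solve L @ X = B for X, with L lower triangular, in nb-sized block steps."""
    n = L.shape[0]
    X = B.astype(float, copy=True)
    for j in range(0, n, nb):
        je = min(j + nb, n)
        # small triangular solve on the current diagonal block
        X[j:je] = solve_triangular(L[j:je, j:je], X[j:je], lower=True)
        if je < n:
            # GEMM: rank-nb update of the remaining block rows
            X[je:] -= L[je:, j:je] @ X[j:je]
    return X
```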

Solution of the Sylvester matrix equation AXBᵀ + CXDᵀ = E

A transformation method is used which employs the QZ algorithm to structure the Sylvester-type matrix equation in such a way that it can be solved columnwise by a back substitution technique.
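The transformation can be summarized as follows (a standard derivation; the orthogonal factors Q1, Z1, Q2, Z2 and the primed names are my notation, not necessarily the paper's). Applying the QZ algorithm to the pairs (A, C) and (D, B) yields factorizations and a transformed equation of the form:

```latex
\begin{gather*}
  A = Q_1 A' Z_1^{T},\quad C = Q_1 C' Z_1^{T},\qquad
  D = Q_2 D' Z_2^{T},\quad B = Q_2 B' Z_2^{T},\\[2pt]
  A' Y B'^{\,T} + C' Y D'^{\,T} = F,\qquad
  Y = Z_1^{T} X Z_2,\qquad F = Q_1^{T} E\, Q_2 .
\end{gather*}
```

Here the primed matrices are (quasi-)triangular, so the transformed equation can be solved columnwise by back substitution, after which X = Z1 Y Z2ᵀ recovers the solution.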

Blocked algorithms and software for reduction of a regular matrix pair to generalized Schur form

A two-stage blocked algorithm for reduction of a regular matrix pair (A, B) to upper Hessenberg-triangular form is presented, along with a blocked variant of the single-diagonal double-shift QZ method for computing the generalized Schur form of (A, B), which outperforms the current LAPACK routines by a factor of 2-5 for sufficiently large problems.