Kapil K. Mathur

A finite element method for computational fluid dynamics has been implemented on the Connection Machine systems CM-2 and CM-200. An implicit iterative solution strategy, based on the preconditioned matrix-free GMRES algorithm, is employed. Parallel data structures built on both nodal and elemental sets are used to achieve maximum parallelization...
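To make the term "matrix-free GMRES" concrete, here is a minimal serial sketch in Python. It is not the implementation from the abstract: it is unpreconditioned, unrestarted, and runs on one processor. The names `gmres` and `apply_laplacian` are hypothetical, and the 1D Laplacian stencil is an assumed stand-in for a finite element operator. The point it illustrates is that the solver only ever calls a function that applies the operator to a vector, so no matrix is assembled.

```python
import math

def apply_laplacian(v):
    """Hypothetical operator: 1D stencil (-1, 2, -1), applied without storing a matrix."""
    n = len(v)
    return [2.0 * v[i] - (v[i - 1] if i > 0 else 0.0) - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def gmres(apply_a, b, tol=1e-10, max_iter=50):
    """Unrestarted, unpreconditioned GMRES with a zero initial guess.

    Assumes apply_a represents a nonsingular linear operator.
    """
    n = len(b)
    beta = math.sqrt(sum(x * x for x in b))
    if beta == 0.0:
        return [0.0] * n
    basis = [[x / beta for x in b]]  # orthonormal Krylov vectors
    r_cols = []                      # columns of the rotated (upper-triangular) Hessenberg
    rotations = []                   # Givens rotations (c, s)
    g = [beta]                       # rotated right-hand side
    for k in range(min(max_iter, n)):
        w = apply_a(basis[k])        # the only place the operator is used
        h = []
        for v in basis:              # modified Gram-Schmidt
            hij = sum(wi * vi for wi, vi in zip(w, v))
            h.append(hij)
            w = [wi - hij * vi for wi, vi in zip(w, v)]
        h_next = math.sqrt(sum(wi * wi for wi in w))
        for i, (c, s) in enumerate(rotations):  # apply earlier rotations to the new column
            h[i], h[i + 1] = c * h[i] + s * h[i + 1], -s * h[i] + c * h[i + 1]
        denom = math.hypot(h[k], h_next)
        c, s = h[k] / denom, h_next / denom     # new rotation zeroes the subdiagonal entry
        rotations.append((c, s))
        h[k] = denom
        g.append(-s * g[k])
        g[k] = c * g[k]
        r_cols.append(h)
        if abs(g[k + 1]) <= tol * beta:         # |g[k+1]| is the current residual norm
            break
        basis.append([wi / h_next for wi in w])
    m = len(r_cols)
    y = [0.0] * m
    for i in range(m - 1, -1, -1):              # back substitution on the triangular system
        acc = g[i] - sum(r_cols[j][i] * y[j] for j in range(i + 1, m))
        y[i] = acc / r_cols[i][i]
    return [sum(y[j] * basis[j][i] for j in range(m)) for i in range(n)]

x_true = [float(i) for i in range(1, 9)]
b = apply_laplacian(x_true)
x = gmres(apply_laplacian, b)
```

In a parallel setting the operator application and the inner products would be the distributed steps; this sketch leaves both serial.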
Some level-2 and level-3 Distributed Basic Linear Algebra Subroutines (DBLAS) that have been implemented on the Connection Machine system CM-200 are described. For matrix-matrix multiplication, both the nonsystolic and the systolic algorithms are outlined. A systolic algorithm that computes the product matrix in-place is described in detail. All algorithms...
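As an illustration of what a systolic matrix-matrix multiplication looks like, the sketch below simulates Cannon's algorithm, a well-known systolic scheme, on an n x n grid with one matrix element per simulated processor. This is a swapped-in textbook algorithm, not necessarily the in-place algorithm the abstract describes, and the function name `cannon_matmul` is hypothetical. Each step uses only nearest-neighbor cyclic shifts, which is the defining property of a systolic algorithm.

```python
def cannon_matmul(a, b):
    """Simulate Cannon's algorithm for square matrices given as lists of lists."""
    n = len(a)
    # Initial skew: row i of A shifts left by i, column j of B shifts up by j.
    a_loc = [[a[i][(j + i) % n] for j in range(n)] for i in range(n)]
    b_loc = [[b[(i + j) % n][j] for j in range(n)] for i in range(n)]
    c = [[0] * n for _ in range(n)]
    for _ in range(n):
        # Every grid position multiplies the pair of elements it currently holds.
        for i in range(n):
            for j in range(n):
                c[i][j] += a_loc[i][j] * b_loc[i][j]
        # Cyclic shifts: A moves one step left, B moves one step up.
        a_loc = [[a_loc[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        b_loc = [[b_loc[(i + 1) % n][j] for j in range(n)] for i in range(n)]
    return c

product = cannon_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

On a real distributed machine each grid position would hold a block rather than a single element, and the shifts would be messages between neighboring processors.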
Efficient data motion is critical for high performance computing on distributed memory architectures. The value of some techniques for efficient data motion is illustrated by identifying generic communication primitives. Further, the efficiency of these primitives is demonstrated on three different applications using the finite element method for unstructured grids...
Detailed algorithms for all-to-all broadcast and reduction are given for arrays mapped by binary or binary-reflected Gray code encoding to the processing nodes of binary cube networks. Algorithms are also given for the local computation of the array indices for the communicated data, thereby reducing the demand for communications bandwidth. For the...
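The two ingredients named in this abstract can be sketched briefly in Python. The code below shows the standard binary-reflected Gray code mapping and a simulation of all-to-all broadcast on a binary cube by dimension exchange (recursive doubling). Both are textbook constructions assumed here for illustration; the abstract's own algorithms are not reproduced, and the function names are hypothetical.

```python
def gray_encode(i):
    """Binary-reflected Gray code of index i."""
    return i ^ (i >> 1)

def gray_decode(g):
    """Inverse of gray_encode."""
    i = 0
    while g:
        i ^= g
        g >>= 1
    return i

def all_to_all_broadcast(dim):
    """Simulate all-to-all broadcast on a binary cube with 2**dim nodes.

    In step d every node exchanges everything it holds with its neighbor
    across cube dimension d, so all nodes hold all data after dim steps.
    """
    p = 1 << dim
    held = [{node} for node in range(p)]  # each node starts with its own item
    for d in range(dim):
        held = [held[node] | held[node ^ (1 << d)] for node in range(p)]
    return held

codes = [gray_encode(i) for i in range(8)]
result = all_to_all_broadcast(3)
```

With the Gray code mapping, consecutive array indices land on nodes whose addresses differ in exactly one bit, so neighboring array elements sit on directly connected cube nodes.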
This paper demonstrates that scalability and competitive efficiency can be achieved for unstructured grid finite element applications on distributed memory machines, such as the Connection Machine CM-5 system. The efficiency of finite element solvers is analyzed through two applications: an implicit computational aerodynamics application and an explicit solid...