We revisit the dependence transformation method and use it to generate parallel algorithms suitable for cluster and grid computing. We illustrate the method with two applications: deriving a systolic matrix product algorithm and computing the alignment score of two strings. The product of two n × n matrices is viewed as the product of two p × p matrices whose elements are (n/p) × (n/p) submatrices. For m such multiplications on p processors, the proposed parallel solution achieves a speedup of mp³/((m+2)p² − 2), which is roughly p, that is, linear in the number of processors. The alignment problem for two strings of lengths m and n is solved in O(p) communication rounds and O(mn/p) local computing time. We report promising experimental results obtained on a 16-node Beowulf cluster and on an 18-node grid of desktop computers called InteGrade.
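
The block view of the matrix product can be made concrete with a short sequential sketch. It is illustrative only: it is not the systolic schedule derived in the paper, it assumes p divides n, and the names `matmul`, `split_blocks`, and `block_matmul` are hypothetical. It shows that treating each matrix as a p × p grid of (n/p) × (n/p) submatrices reduces one product to p³ submatrix products, which is the work a parallel schedule would distribute over the processors.

```python
def matmul(a, b):
    """Plain triple-loop product of two dense matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def split_blocks(mat, p):
    """Cut an n x n matrix into a p x p grid of (n/p) x (n/p) submatrices."""
    n = len(mat)
    if n % p != 0:
        raise ValueError("this sketch assumes p divides n")
    s = n // p  # side length of each submatrix
    return [[[row[J * s:(J + 1) * s] for row in mat[I * s:(I + 1) * s]]
             for J in range(p)]
            for I in range(p)]


def block_matmul(a, b, p):
    """Multiply two n x n matrices as p x p matrices of submatrices."""
    n = len(a)
    s = n // p
    blocks_a = split_blocks(a, p)
    blocks_b = split_blocks(b, p)
    c = [[0] * n for _ in range(n)]
    for I in range(p):
        for J in range(p):
            # C[I][J] is the sum over K of A[I][K] * B[K][J]: p submatrix products.
            acc = [[0] * s for _ in range(s)]
            for K in range(p):
                prod = matmul(blocks_a[I][K], blocks_b[K][J])
                for i in range(s):
                    for j in range(s):
                        acc[i][j] += prod[i][j]
            # Copy the finished submatrix into its place in the result.
            for i in range(s):
                c[I * s + i][J * s:(J + 1) * s] = acc[i]
    return c
```

For any p that divides n, `block_matmul(a, b, p)` returns the same matrix as `matmul(a, b)`; only the grouping of the arithmetic changes.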

