Weijia Shang

Most existing methods of mapping algorithms into processor arrays are restricted to the case where n-dimensional algorithms, or algorithms with n nested loops, are mapped into (n-1)-dimensional arrays. However, in practice, it is interesting to map n-dimensional algorithms into (k-1)-dimensional arrays where k < n. For example, many algorithms at bit...
With the objective of minimizing the total execution time of a parallel program on a distributed memory parallel computer, this paper discusses how to find an optimal supernode size and optimal supernode relative side lengths of a supernode transformation (also known as tiling). We identify three parameters of supernode transformation: supernode size, ...
With the objective of minimizing the total execution time of a parallel program on a distributed memory parallel computer, this paper discusses the selection of an optimal supernode shape of a supernode transformation (also known as tiling). We assume that the communication cost is dominated by the startup penalty and can therefore be approximated by a ...