Learn More
The <italic>polyhedral model</italic> provides a single unified foundation for systolic array synthesis and automatic parallelization of loop programs. We investigate the problem of memory reuse when compiling Alpha (a functional language based on this model). Direct compilation would require unacceptably large memory (for example(More)
Automatic parallelization in the polyhedral model is based on aane transformations from an original computation domain (iteration space) to a target space-time domain, often with a diierent transformation for each variable. Code generation is an often ignored step in this process that has a signiicant impact on the quality of the nal code. It involves(More)
CUDASW++ is a parallelization of the Smith-Waterman algorithm for CUDA graphical processing units that computes the similarity scores of a query sequence paired with each sequence in a database. The algorithm uses one of two kernel functions to compute the score between a given pair of sequences: the inter-task kernel or the intra-task kernel. We have(More)
For 2-D iteration space tiling, we address the problem of determining the tile parameters that minimize the total execution time under the BSP model. We consider uniform dependency computations, tiled so that (at least) one of the tile boundaries is parallel to the domain boundary. We determine the optimal tile size as a closed form solution. In addition,(More)
As we move towards exa-scale computing, energy is becoming increasingly important, even in the high performance computing arena. However, the simple equation, Energy = Power × Time, suggests that optimizing for speed already optimizes for energy, under the assumption that Power is constant. When power is not constant, a strategy that achieves energy savings(More)