This paper presents an algorithm to find the optimal affine partitions that maximize the degree of parallelism and minimize the degree of synchronization in programs with arbitrary loop nestings and affine data accesses. The problem is formulated without the use of imprecise data dependence abstractions such as data dependence vectors. The algorithm…
An affine partitioning framework unifies many useful program transforms, such as unimodular transformations (interchange, reversal, skewing), loop fusion, fission, scaling, reindexing, and statement reordering. This paper presents an algorithm, based on this unified framework, that maximizes parallelism while minimizing communication in programs with…
This paper presents the first algorithm to find the optimal affine transform that maximizes the degree of parallelism while minimizing the degree of synchronization in a program with arbitrary loop nestings and affine data accesses. The problem is formulated without the use of imprecise data dependence abstractions such as data dependence vectors. The…
Applicable to arbitrary sequences and nests of loops, affine partitioning is a program transformation framework that unifies many previously proposed loop transformations, including unimodular transforms, fusion, fission, reindexing, scaling, and statement reordering. Algorithms based on affine partitioning have been shown to be effective for…
This paper presents an overview of a parallelizing compiler that automatically generates efficient code for large-scale parallel architectures from sequential input programs. This research focuses on loop-level parallelism in dense matrix computations. We illustrate the basic techniques the compiler uses by describing the entire compilation process for a simple…
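As a rough illustration of the affine-partitioning idea that recurs in the abstracts above, the following minimal sketch applies an affine partition to a toy loop nest. The loop nest, the partition p = i - j, and the function names are all assumptions made for illustration; none of them is taken from the papers, and this is not the algorithm the papers describe, only an instance of the kind of mapping such an algorithm might produce.

```python
# Hypothetical loop nest (an assumption, not from the papers):
#   for i in 1..n-1:
#     for j in 1..n-1:
#       X[i][j] = X[i-1][j-1] + 1
# Every dependence runs from iteration (i-1, j-1) to (i, j), so the
# affine partition p = i - j keeps each dependence chain inside one
# partition. Different partitions then never need to synchronize.


def run_sequential(n):
    """Execute the loop nest in its original order."""
    x = [[0] * n for _ in range(n)]
    for i in range(1, n):
        for j in range(1, n):
            x[i][j] = x[i - 1][j - 1] + 1
    return x


def partition_of(i, j):
    """Illustrative affine partition mapping: p = i - j."""
    return i - j


def run_partitioned(n):
    """Execute the same iterations grouped by partition.

    Partitions are visited in reverse order to show that their relative
    order does not matter; within a partition, i increases so that each
    dependence is still respected.
    """
    x = [[0] * n for _ in range(n)]
    for p in reversed(range(-(n - 2), n - 1)):
        for i in range(1, n):
            j = i - p  # the only j with partition_of(i, j) == p
            if 1 <= j < n:
                x[i][j] = x[i - 1][j - 1] + 1
    return x


if __name__ == "__main__":
    print(run_partitioned(6) == run_sequential(6))
```

In this sketch each value of p could be handed to a separate processor, since no iteration in one partition reads a value written in another.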