Supercompilers must reschedule computations defined by nested DO-loops in order to make efficient use of supercomputer features (vector units, multiple elementary processors, cache memory, etc.). Many rescheduling techniques, such as loop interchange, loop strip-mining or rectangular partitioning, have been described to speed up program execution. We …
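As a hedged illustration of one of the rescheduling techniques named above, the sketch below strip-mines a simple loop. It is written in C rather than FORTRAN, and the strip size BS and the function and array names are illustrative choices, not taken from the paper.

    /* Original loop: one flat pass over n elements. */
    void add(float *a, const float *b, int n) {
        for (int i = 0; i < n; i++)
            a[i] += b[i];
    }

    /* Strip-mined version: the iteration space is cut into strips of
       BS iterations, e.g. to match a vector register length or a
       cache block.  BS is an illustrative constant. */
    #define BS 64
    void add_strip_mined(float *a, const float *b, int n) {
        for (int is = 0; is < n; is += BS)
            for (int i = is; i < is + BS && i < n; i++)
                a[i] += b[i];
    }

Both versions perform the same additions; only the order in which the iterations are scheduled changes.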
Supercompilers perform complex program transformations which often result in new loop bounds. This paper shows that, under the usual assumptions in automatic parallelization, most transformations on loop nests can be expressed as affine transformations on integer sets defined by polyhedra and that the new loop bounds can be computed with algorithms using …
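A minimal worked example of the idea, not taken from the paper: loop interchange on a triangular nest. The nest, the hypothetical body() statement, and the use of Fourier-Motzkin elimination below are my illustrative choices.

    void body(int i, int j);   /* hypothetical loop body */

    /* Original triangular nest. */
    void original(int n) {
        for (int i = 1; i <= n; i++)
            for (int j = i; j <= n; j++)
                body(i, j);
    }

    /* The iteration set is the polyhedron
           { (i,j) | 1 <= i, i <= j, j <= n }.
       Interchanging the loops means scanning the same polyhedron with
       j outermost: eliminating i (e.g. by Fourier-Motzkin) yields the
       outer bounds 1 <= j <= n, and for a fixed j the remaining
       constraints give the inner bounds 1 <= i <= j. */
    void interchanged(int n) {
        for (int j = 1; j <= n; j++)
            for (int i = 1; i <= j; i++)
                body(i, j);
    }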
PIPS is an experimental FORTRAN source-to-source parallelizer that combines the goal of exploring interprocedural and semantical analysis with a requirement for compilation speed. We present in this paper the main features of PIPS, i.e., demand-driven architecture, automatic support for multiple implementation languages, structured control graph, …
Modular static analyzers use procedure abstractions, a.k.a. summarizations, to ensure that their execution time increases linearly with the size of analyzed programs. A similar abstraction mechanism is also used within a procedure to perform a bottom-up analysis. For instance, a sequence of instructions is abstracted by combining the abstractions of its …
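A toy sketch of the bottom-up mechanism described above, under an assumption of my own: each statement's effect on a single integer variable is summarized as an affine transformer x -> a*x + b, and a sequence is abstracted by composing the summaries of its components instead of re-analyzing them. This transformer domain is an illustrative choice, not the abstraction used in the paper.

    #include <stdio.h>

    typedef struct { long a, b; } xfer;           /* x -> a*x + b    */

    /* Summary of "run s1, then s2". */
    xfer compose(xfer s2, xfer s1) {
        xfer r = { s2.a * s1.a, s2.a * s1.b + s2.b };
        return r;
    }

    int main(void) {
        xfer inc = {1, 1};                         /* x = x + 1       */
        xfer dbl = {2, 0};                         /* x = 2 * x       */
        xfer seq = compose(dbl, inc);              /* whole sequence  */
        printf("sequence summary: x -> %ld*x + %ld\n", seq.a, seq.b);
        return 0;                                  /* prints 2*x + 2  */
    }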
We present an automatic, static program transformation that schedules and generates efficient memory transfers between a computer host and its hardware accelerator, addressing a well-known performance bottleneck. Our automatic approach uses two simple heuristics: to perform transfers to the accelerator as early as possible and to delay transfers back from the …
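A hedged sketch of the two heuristics on a toy host/accelerator program. copy_to_accel(), copy_from_accel() and accel_kernel() are hypothetical primitives introduced only for this illustration; they are not an API from the paper or from any particular runtime.

    void copy_to_accel(float *p, int n);     /* hypothetical transfer  */
    void copy_from_accel(float *p, int n);   /* primitives (assumed)   */
    void accel_kernel(float *p, int n);      /* offloaded computation  */

    /* Naive placement: transfers surround every offloaded step. */
    void naive(float *a, int n, int steps) {
        for (int k = 0; k < steps; k++) {
            copy_to_accel(a, n);      /* redundant after step 0       */
            accel_kernel(a, n);
            copy_from_accel(a, n);    /* only the last copy is needed */
        }
    }

    /* After the transformation: copy to the accelerator as early as
       possible, delay the copy back as long as possible. */
    void optimized(float *a, int n, int steps) {
        copy_to_accel(a, n);
        for (int k = 0; k < steps; k++)
            accel_kernel(a, n);
        copy_from_accel(a, n);
    }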
Asynchronous CALL statements are necessary in order to use more than one processor in current multiprocessors. Detecting CALL statements that may be executed in parallel is one way to fill this need. This approach requires accurate approximations of called procedure effects. This is achieved by using new objects called Region and …
Many program optimizations require exact knowledge of the sets of array elements that are referenced in or that flow between statements or procedures. Some examples are array privatization, generation of communications in distributed memory machines, or compile-time optimization of cache behavior in hierarchical memory machines. Exact array region analysis …
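A small illustration of the kind of information such an analysis computes. The loop and the region notation below are mine, used only to make the idea concrete; they are not taken from the paper.

    /* Assuming a has at least n+2 elements and b at least 2*n+1:
       Written region of a : { a[k] | 2 <= k <= n+1 }
       Read    region of a : { a[k] | 1 <= k <= n   }
       Read    region of b : { b[k] | 2 <= k <= 2*n, k even } */
    void stencil(float *a, const float *b, int n) {
        for (int i = 1; i <= n; i++)
            a[i + 1] = a[i] + b[2 * i];
    }

Exact regions like these let a later pass prove, for instance, that b is read-only in this loop, or that the written and read regions of a overlap.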
Programming parallel machines as effectively as sequential ones would ideally require a language that provides high-level programming constructs to avoid the programming errors that are frequent when expressing parallelism. Since task parallelism is considered more error-prone than data parallelism, we survey six popular and efficient parallel language designs that …
Many abstractions of program dependences have already been proposed, such as the Dependence Distance, the Dependence Direction Vector, the Dependence Level or the Dependence Cone. These different abstractions have different precisions. The minimal abstraction associated with a transformation is the abstraction that contains the minimal amount of information …
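A hedged example of two of these abstractions on a toy nest; the loop is illustrative and not taken from the paper.

    /* Each iteration reads values written by earlier iterations. */
    void wavefront(int n, int m, float a[n][m]) {
        for (int i = 1; i < n; i++)
            for (int j = 1; j < m; j++)
                a[i][j] = a[i-1][j] + a[i][j-1];
    }

    /* Dependences carried by the writes to a:
         distance vectors : (1,0) and (0,1)
         direction vectors: (<,=) and (=,<)
       The distance vectors are the more precise abstraction; the
       direction vectors keep only the signs of the distances. */

Which abstraction is "minimal" depends on the transformation: loop interchange, for instance, only needs the direction information, whereas tiling decisions may need the distances.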