François Irigoin

Supercompilers perform complex program transformations which often result in new loop bounds. This paper shows that, under the usual assumptions in automatic parallelization, most transformations on loop nests can be expressed as affine transformations on integer sets defined by polyhedra, and that the new loop bounds can be computed with algorithms using…
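As an illustration of the idea (an example of ours, not taken from the paper): loop interchange is the affine map (i, j) -> (j, i) on the iteration set {(i, j) | 1 <= i <= j <= n}, and the interchanged bounds are recovered by projecting that same polyhedron in the new loop order, for instance by Fourier-Motzkin elimination.

    /* Triangular nest; iteration polyhedron {(i,j) | 1 <= i <= j <= n}. */
    void upper_orig(int n, double a[n+1][n+1]) {
        for (int i = 1; i <= n; i++)
            for (int j = i; j <= n; j++)
                a[i][j] += 1.0;
    }

    /* After interchange, the affine map (i,j) -> (j,i), eliminating i
     * from the same constraints yields the new bounds:
     * 1 <= j <= n outside, 1 <= i <= j inside. */
    void upper_interchanged(int n, double a[n+1][n+1]) {
        for (int j = 1; j <= n; j++)
            for (int i = 1; i <= j; i++)
                a[i][j] += 1.0;
    }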
Supercompilers must reschedule computations defined by nested DO-loops in order to make efficient use of supercomputer features (vector units, multiple elementary processors, cache memory, etc.). Many rescheduling techniques, like loop interchange, loop strip-mining, or rectangular partitioning, have been described to speed up program execution. We…
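For instance (a generic sketch of one of the named techniques, not the paper's own formulation), strip-mining splits a single loop into a loop over fixed-length strips and a loop within one strip, the usual first step toward using a vector unit; the strip length B below is an arbitrary choice:

    #define B 64  /* strip length; machine-dependent in practice */

    /* daxpy, strip-mined: the outer loop enumerates strips, the inner
     * loop scans one strip; the bound test handles the tail strip. */
    void daxpy_stripmined(int n, double alpha, const double *x, double *y) {
        for (int is = 0; is < n; is += B) {
            int up = (is + B < n) ? is + B : n;  /* end of this strip */
            for (int i = is; i < up; i++)
                y[i] += alpha * x[i];
        }
    }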
PIPS is an experimental FORTRAN source-to-source parallelizer that combines the goal of exploring interprocedural and semantic analysis with a requirement for compilation speed. We present in this paper the main features of PIPS, i.e., demand-driven architecture, automatic support for multiple implementation languages, structured control graph, predicates…
Asynchronous CALL statements are necessary in order to use more than one processor in current multiprocessors. Detecting CALL statements that may be executed in parallel is one way to fill this need. This approach requires accurate approximations of called procedure effects. This is achieved by using new objects called Region and…
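A toy example of how such summaries are used (names and code are hypothetical, not from the paper): when the summarized write regions of two calls are disjoint and neither reads what the other writes, the calls are independent and may run as asynchronous CALLs.

    /* scale_range writes exactly the region { a[k] | lo <= k < hi }. */
    void scale_range(double *a, int lo, int hi) {
        for (int k = lo; k < hi; k++)
            a[k] *= 2.0;
    }

    /* The write regions { a[k] | 0 <= k < n/2 } and { a[k] | n/2 <= k < n }
     * do not intersect, so the two calls may execute in parallel. */
    void scale_all(double *a, int n) {
        scale_range(a, 0, n / 2);
        scale_range(a, n / 2, n);
    }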
Many program optimizations require exact knowledge of the sets of array elements that are referenced in or that flow between statements or procedures. Some examples are array privatization, generation of communications in distributed memory machines, or compile-time optimization of cache behavior in hierarchical memory machines. Exact array region analysis…
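To make the notion concrete (our notation, not necessarily the paper's): an exact array region describes, as a polyhedron parametric in the surrounding variables, precisely which elements a piece of code reads or writes.

    /* Exact regions for this nest, parametric in n:
     *   written region of a: { a[i][j] | 1 <= i <= n, 1 <= j <= i }
     *   read    region of b: { b[j]    | 1 <= j <= n }
     * Knowing the read region of b exactly allows, e.g., a single bulk
     * communication of b[1..n] on a distributed memory machine. */
    void copy_lower(int n, double a[n+1][n+1], const double b[n+1]) {
        for (int i = 1; i <= n; i++)
            for (int j = 1; j <= i; j++)
                a[i][j] = b[j];
    }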
Modular static analyzers use procedure abstractions, a.k.a. summarizations, to ensure that their execution time increases linearly with the size of analyzed programs. A similar abstraction mechanism is also used within a procedure to perform a bottom-up analysis. For instance, a sequence of instructions is abstracted by combining the abstractions of its…
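As a deliberately simplistic model of this bottom-up composition (a sketch of ours, not the abstract domain actually used): summarize each statement's effect on a variable as an affine transformer x -> a*x + b, and abstract a sequence by composing the summaries of its components, without reanalyzing them.

    #include <stdio.h>

    /* Toy summary: a statement's effect on one integer variable,
     * abstracted as the affine transformer x -> a*x + b. */
    typedef struct { long a, b; } Transformer;

    /* Abstraction of the sequence s1; s2: compose the two summaries. */
    Transformer compose(Transformer s1, Transformer s2) {
        /* (s2 after s1)(x) = s2.a * (s1.a * x + s1.b) + s2.b */
        Transformer r = { s2.a * s1.a, s2.a * s1.b + s2.b };
        return r;
    }

    int main(void) {
        Transformer inc = { 1, 1 };           /* x = x + 1 */
        Transformer dbl = { 2, 0 };           /* x = 2 * x */
        Transformer seq = compose(inc, dbl);  /* x = 2 * (x + 1) */
        printf("x -> %ld*x + %ld\n", seq.a, seq.b);  /* x -> 2*x + 2 */
        return 0;
    }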
We present an automatic, static program transformation that schedules and generates efficient memory transfers between a computer host and its hardware accelerator, addressing a well-known performance bottleneck. Our automatic approach uses two simple heuristics: to perform transfers to the accelerator as early as possible and to delay transfers back from the…
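A sketch of the effect of the two heuristics (copy_to_accel, copy_from_accel, and accel_kernel are placeholder names, not a real API):

    /* Hypothetical runtime interface, stubbed so the sketch compiles. */
    static void copy_to_accel(double *a, int n)   { (void)a; (void)n; }
    static void copy_from_accel(double *a, int n) { (void)a; (void)n; }
    static void accel_kernel(double *a, int n) {
        for (int i = 0; i < n; i++) a[i] += 1.0;
    }

    /* Naive placement: one transfer each way per time step. */
    void run_naive(double *a, int n, int steps) {
        for (int t = 0; t < steps; t++) {
            copy_to_accel(a, n);
            accel_kernel(a, n);
            copy_from_accel(a, n);
        }
    }

    /* "As early as possible" hoists the copy-in above the loop and
     * "as late as possible" sinks the copy-out below it: one transfer
     * each way in total, since only the accelerator touches a inside. */
    void run_optimized(double *a, int n, int steps) {
        copy_to_accel(a, n);
        for (int t = 0; t < steps; t++)
            accel_kernel(a, n);
        copy_from_accel(a, n);
    }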
Parallel and heterogeneous computing are reaching an ever wider audience thanks to the increased performance brought by ubiquitous manycores and GPUs. However, available programming models, like OPENCL or CUDA, are far from being straightforward to use. As a consequence, several automated or semi-automated approaches have been proposed to automatically generate…
Many abstractions of program dependences have already been proposed, such as the Dependence Distance, the Dependence Direction Vector, the Dependence Level, or the Dependence Cone. These different abstractions have different precision. The minimal abstraction associated to a transformation is the abstraction that contains the minimal amount of information…
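For a concrete instance of these abstractions (our example): in the nest below, the element a[i-1][j+2] read at iteration (i, j) was written at iteration (i-1, j+2), so the dependence has distance vector (1, -2), direction vector (<, >), and level 1, i.e., it is carried by the outer loop; the dependence cone would be generated by such distance vectors.

    /* Flow dependence from the write of a[i][j] to the read of
     * a[i-1][j+2]: distance (1, -2), direction (<, >), level 1.
     * Only the outer loop carries it, so the inner loop remains
     * parallelizable under the distance-vector abstraction. */
    void stencil(int n, double a[n][n]) {
        for (int i = 1; i < n; i++)
            for (int j = 2; j + 2 < n; j++)
                a[i][j] = a[i-1][j+2] + 1.0;
    }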