• Publications
Multi-dimensional Rankings, Program Termination, and Complexity Bounds of Flowchart Programs
TLDR
We propose an efficient algorithm to compute ranking functions: It can handle flowcharts of arbitrary structure.
  • 149
  • 23
Scheduling and Automatic Parallelization
I Unidimensional Problems: 1 Scheduling DAGs without Communications; 2 Scheduling DAGs with Communications; 3 Cyclic Scheduling. II Multidimensional Problems: 4 Systems of Uniform Recurrence…
  • 193
  • 19
Lattice-based memory allocation
TLDR
We investigate the problem of memory reuse in order to reduce the memory needed to store an array variable.
  • 123
  • 15
On the Complexity of Loop Fusion
  • A. Darte
  • Computer Science
  • Parallel Comput.
  • 1 August 2000
TLDR
Loop fusion is a program transformation that combines several loops into one.
  • 134
  • 9
Loop Parallelization Algorithms: From Parallelism Extraction to Code Generation
TLDR
In this paper, we survey loop parallelization algorithms, analyzing the dependence representations they use, the loop transformations they generate, the code generation schemes they require, and their ability to incorporate optimization criteria such as maximal parallelism detection, permutable loop detection, minimization of synchronizations, and ease of code generation.
  • 100
  • 9
Circuit Retiming Applied to Decomposed Software Pipelining
TLDR
This paper elaborates on a new view of software pipelining, called decomposed software pipelining, which brings new insight into the software pipelining problem by establishing its deep link with the circuit retiming problem.
  • 51
  • 7
Bee+Cl@k: an implementation of lattice-based array contraction in the source-to-source translator Rose
TLDR
We build on prior work on intra-array memory reuse, for which a general theoretical framework was proposed based on lattice theory, and propose a new algorithm for lattice-based memory reuse.
  • 35
  • 7
(Pen)-ultimate tiling?
TLDR
In the framework of perfect loop nests with uniform dependences, tiling is a technique used to group elemental computation points so as to increase computation granularity and to reduce the overhead due to communication time.
  • 155
  • 5
Mapping Uniform Loop Nests Onto Distributed Memory Architectures
TLDR
We use affine-by-statement scheduling, mapping and partitioning techniques for uniform loop nests to synthesize a virtual grid architecture from the original loop nest.
  • 68
  • 5
Optimizing remote accesses for offloaded kernels: Application to high-level synthesis for FPGA
TLDR
We show how to automatically generate the sets of data to be read from (resp. written to) the external memory just before (resp. after) each tile so as to reduce communications and reuse data as much as possible in the accelerator.
  • 31
  • 5