Benoît Meister

DARPA's Ubiquitous High-Performance Computing (UHPC) program asked researchers to develop computing systems capable of achieving energy efficiencies of 50 GOPS/Watt, assuming 2018-era fabrication technologies. This paper describes Runnemede, the research architecture developed by the Intel-led UHPC team. Runnemede is being developed through a co-design…
Programmers for GPGPU face a rapidly changing substrate of programming abstractions, execution models, and hardware implementations. It has been established, through numerous demonstrations for particular conjunctions of application kernel, programming language, and GPU hardware instance, that it is possible to achieve significant improvements in the…
We describe a novel loop nest scheduling strategy implemented in the R-Stream compiler: the first scheduling formulation to jointly optimize a trade-off between parallelism, locality, contiguity of array accesses, and data layout permutations in a single complete formulation. Our search space contains the maximal amount of vectorization in the program and…
A significant way to enhance application performance and reduce power consumption in embedded processor applications is to improve the usage of the memory hierarchy. In this paper, a temporal and spatial locality optimization framework for nested loops is proposed, driven by parameterized cost functions. The considered loops can be imperfectly…
One of the most efficient ways to improve program performance on today's computers is to optimize the way cache memories are used. In particular, many scientific applications contain loop nests that operate on large multi-dimensional arrays whose sizes are often parameterized. No special attention is paid to cache memory performance when such loops are…
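As an illustration of the kind of cache-oriented loop restructuring this abstract alludes to, the following minimal sketch contrasts a plain loop nest with a tiled one over an array of parameterized size `n`. Tiling is a standard locality technique chosen here as an assumption; the truncated abstract does not say which transformation the paper uses, and the function names and `tile` parameter are hypothetical.

```python
def transpose_naive(a, n):
    """Transpose an n x n matrix stored row-major in a flat list."""
    out = [0] * (n * n)
    for i in range(n):
        for j in range(n):
            # Writes to `out` stride by n, so consecutive writes land far apart.
            out[j * n + i] = a[i * n + j]
    return out


def transpose_tiled(a, n, tile=4):
    """Same computation, visiting the iteration space tile by tile.

    Each tile touches a small block of `a` and `out`, which is the
    property that helps cache reuse in a compiled language.
    """
    out = [0] * (n * n)
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            # min() handles sizes that are not a multiple of the tile size.
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, n)):
                    out[j * n + i] = a[i * n + j]
    return out


if __name__ == "__main__":
    n = 10
    a = list(range(n * n))
    print(transpose_naive(a, n) == transpose_tiled(a, n, tile=3))
```

Python itself will not show the cache benefit; the sketch only demonstrates that the tiled loop order computes the same result for any `n` and tile size.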
The polyhedral model is a well-known compiler optimization framework for the analysis and transformation of affine loop nests. We present a new method to solve a difficult geometric operation that is raised by this model: the integer affine transformation of parametric ℤ-polytopes. The result of such a transformation is given by a worst-case…
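To see why the integer affine image of a ℤ-polytope is a difficult object, the following small sketch enumerates one by brute force. The polytope, the map, and the function names are hypothetical choices for illustration and are not taken from the paper: the set {(i, j) : 0 ≤ i ≤ n, 0 ≤ j ≤ 1} is mapped by (i, j) → 3i + j. The image has "holes", so it is not simply the set of integer points of the transformed rational polytope.

```python
from itertools import product


def affine_image(n):
    """Image of {(i, j) : 0 <= i <= n, 0 <= j <= 1} under (i, j) -> 3*i + j."""
    return {3 * i + j for i, j in product(range(n + 1), range(2))}


def holes(image):
    """Integers inside the convex hull of the image that the map never reaches."""
    return set(range(min(image), max(image) + 1)) - image


if __name__ == "__main__":
    for n in range(1, 4):
        img = affine_image(n)
        print(n, sorted(img), sorted(holes(img)))
```

Brute-force enumeration only works for fixed values of the parameter `n`; a symbolic method is needed to describe the image for all values of `n` at once.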
The Open Community Runtime (OCR) is a new runtime system designed to meet the needs of extreme-scale computing. While there is growing support for the idea that future execution models will be based on dynamic tasks, there is little agreement on what else should be included. OCR minimally adds events for synchronization and relocatable data-blocks for data…
This paper presents a new method for computing the integer hull of a parameterized rational polyhedron by introducing the concept of a periodic polyhedron. Besides its general relevance to parametric combinatorial optimization, the method has many applications for the analysis, optimization, and parallelization of loop nests, especially in compilers.
For applications that deal with large amounts of high-dimensional multi-aspect data, it becomes natural to represent such data as tensors or multi-way arrays. Multi-linear algebraic computations such as tensor decompositions are performed for summarization and analysis of such data. Their use in real-world applications can span across domains such as signal…