Polly's Polyhedral Scheduling in the Presence of Reductions
@article{Doerfert2015PollysPS, title={Polly's Polyhedral Scheduling in the Presence of Reductions}, author={Johannes Doerfert and Kevin Streit and Sebastian Hack and Zino Benaissa}, journal={ArXiv}, year={2015}, volume={abs/1505.07716} }
The polyhedral model provides a powerful mathematical abstraction to enable effective optimization of loop nests with respect to a given optimization goal, e.g., exploiting parallelism. Unexploited reduction properties are a frequent reason for polyhedral optimizers to assume parallelism-prohibiting dependences. To our knowledge, no polyhedral loop optimizer available in any production compiler provides support for reductions. In this paper, we show that leveraging the parallelism of reductions…
23 Citations
Polyhedral expression propagation
- Computer Science, CC
- 2018
This paper presents a technique that statically propagates expressions in order to avoid communicating their result via memory, which outperforms a state-of-the-art polyhedral optimization especially designed for this kind of program by a factor of up to 2.03×.
Polygeist: Raising C to Polyhedral MLIR
- Computer Science, 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT)
- 2021
The Polygeist/MLIR intermediate representation featuring high-level (affine) loop constructs and n-D arrays embedded into a single static assignment (SSA) substrate enables an unprecedented combination of SSA-based and polyhedral optimizations.
Reduction drawing: Language constructs and polyhedral compilation for reductions on GPUs
- Computer Science, 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT)
- 2016
This work presents language constructs that let a programmer express arbitrary reductions on user-defined data types matching the performance of tuned library implementations, and extends a polyhedral compilation flow to process these user-defined reductions.
Simplifying Multiple-Statement Reductions with the Polyhedral Model
- Computer Science, ArXiv
- 2020
This work identifies and formalizes the multiple-statement reduction problem as a bilinear optimization problem, presents a heuristic optimization algorithm for these reductions, and demonstrates that the algorithm provides optimal complexity for a set of benchmark programs from the literature on probabilistic inference algorithms, whose performance critically relies on simplifying these reductions.
Scheduling and Tiling Reductions on Realistic Machines
- Computer Science, ArXiv
- 2018
This work reviews the techniques presented in Gupta et al., identifies a potential issue in their scheduling algorithm and provides a solution, demonstrates how these scheduling techniques can be extended to "tile" reductions, and briefly surveys other studies that address the problem of scheduling reductions.
md_poly: A Performance-Portable Polyhedral Compiler Based on Multi-Dimensional Homomorphisms
- Computer Science
- 2020
Programming state-of-the-art parallel architectures such as multi-core CPU and many-core GPU is challenging. For high performance, the programmer has to optimize its source code for the…
Discovery and exploitation of general reductions: A constraint based approach
- Computer Science, 2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)
- 2017
A new compiler-based approach is developed that automatically detects a wide class of reductions and exploits histograms; it is based on a constraint formulation of the reduction idiom and has been implemented as an LLVM pass.
Application-Specific Arithmetic in High-Level Synthesis Tools
- Computer Science, ACM Trans. Archit. Code Optim.
- 2020
This work studies hardware-specific optimization opportunities currently unexploited by high-level synthesis compilers; the optimizations are prototyped in the GeCoS source-to-source compiler and evaluated on the Polybench and EEMBC benchmark suites.
An Interval Compiler for Sound Floating-Point Computations
- Computer Science, 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)
- 2021
IGen, a source-to-source compiler that translates a given C function using floating-point into an equivalent sound C function that uses interval arithmetic, is presented, showing that the generated code delivers sound double precision results at high performance.
Automatically harnessing sparse acceleration
- Computer Science, CC
- 2020
A new approach based on the LiLAC specification is presented: rather than requiring the application developer to (re)write every program for a given library, the burden is shifted to a one-off description by the library implementer.
References
Polly - Performing Polyhedral Optimizations on a Low-Level Intermediate Representation
- Computer Science, Parallel Process. Lett.
- 2012
Polly is presented, an infrastructure for polyhedral optimizations on the compiler's internal, low-level intermediate representation (IR), together with an interface for connecting external optimizers and a novel way of using the parallelism they introduce to generate SIMD and OpenMP code.
The Polyhedral Model Is More Widely Applicable Than You Think
- Computer Science, CC
- 2010
This work concentrates on extending the code generation step and does not compromise the expressiveness of the model, presenting experimental evidence that the extension is relevant for program optimization and parallelization, showing performance improvements on benchmarks that were thought to be out of reach of the polyhedral model.
Scheduling reductions
- Computer Science, ICS '94
- 1994
A scheduling method based on the algorithms from [Fea92a, Fea92b] which works in the presence of reductions is presented, and it is shown that side effects of scheduling reductions are the simplification of the scheduling process and the improvement of the computed schedules.
Automatic Transformations for Communication-Minimized Parallelization and Locality Optimization in the Polyhedral Model
- Computer Science, CC
- 2008
This work proposes an automatic transformation framework to optimize arbitrarily-nested loop sequences with affine dependences for parallelism and locality simultaneously and finds good tiling hyperplanes by embedding a powerful and versatile cost function into an Integer Linear Programming formulation.
When polyhedral transformations meet SIMD code generation
- Computer Science, PLDI 2013
- 2013
This work defines the concept of vectorizable codelets, with properties tailored to achieve effective SIMD code generation for the codelet, and uses the power of a modern high-level transformation framework to restructure a program to expose good ISA-independent vectorizable codes, exploiting multi-dimensional data reuse.
Scan detection and parallelization in "inherently sequential" nested loop programs
- Computer Science, CGO '12
- 2012
This work presents a method for automatically parallelizing "inherently sequential" programs, which handles arbitrarily nested loops, and identifies situations where the computation performed by the loop body is equivalent to a matrix vector product over a semi-ring.
Scheduling reductions on realistic machines
- Computer Science, SPAA '02
- 2002
An algorithm is developed to determine efficient serializations of all reductions in systems of affine recurrence equations over polyhedral domains.
Static analysis of upper and lower bounds on dependences and parallelism
- Computer Science, TOPL
- 1994
A two-step approach to the search for parallelism in sequential programs is proposed, which lets us distinguish inherently sequential code from code that contains unexploited parallelism and produces information about the kinds of transformations needed to parallelize the code, without worrying about the order of application of the transformations.
isl: An Integer Set Library for the Polyhedral Model
- Art, ICMS
- 2010
In compiler research, polytopes and related mathematical objects have been successfully used for several decades to represent and manipulate computer programs in an approach that has become known as…
A framework for enhancing data reuse via associative reordering
- Computer Science, PLDI 2014
- 2014
It is shown how stencil operations can be implemented to better exploit register reuse and reduce load/stores and a multi-dimensional retiming formalism is developed to characterize the space of valid implementations in conjunction with other program transformations.