Automatic Analysis of Loops to

Abstract

Configurable Arithmetic Logic Units (ALUs) offer opportunities for adapting the underlying hardware to the computation for efficiency. Identifying the optimal configurations at different steps in a program is a very complex problem, but solving it allows the power of these ALUs to be used to the fullest. This paper focuses on developing an automatic compilation framework for exploiting operator parallelism within loop nests. The focus of this analysis is on identifying and maximally using configurations to avoid costly reconfiguration overheads. In our framework, some operator and loop transformations are first carried out to expose more opportunities for configuration reuse. We then present a two-pass solution. The first pass attempts to group the statements that have "similar" configuration demands together into cutsets and generates the corresponding configurations. The second pass analyzes the trade-offs between the costs and benefits of reconfigurations across different cutsets and attempts to eliminate the reconfiguration overheads by merging cutsets. This methodology is implemented in the SUIF compilation system and is tested using some loops extracted from the Perfect benchmarks and the Livermore kernels. The speed-ups obtained are quite substantial, showing the usefulness of the method. The method also scales well with the loop sizes and the amount of space available on FPGAs.
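
The two-pass idea can be illustrated with a small sketch. This is not the paper's algorithm or its SUIF implementation: the abstract gives no cost model, so everything below is an assumption made for illustration, including the `AREA` table, the Jaccard similarity test, the 0.5 threshold, the cycle estimate, and all function names. Statements are modelled as dictionaries mapping an operator kind to the number of such operations, and each statement is assumed to fit on the device by itself.

```python
from math import ceil

# Hypothetical area (in abstract FPGA cells) of one unit of each operator.
AREA = {"add": 1, "mul": 4, "div": 8}

def area(config):
    return sum(AREA[op] * units for op, units in config.items())

def union(a, b):
    """Element-wise maximum: enough units to serve either demand."""
    return {op: max(a.get(op, 0), b.get(op, 0)) for op in sorted(a.keys() | b.keys())}

def similarity(a, b):
    """Jaccard similarity of the operator kinds used (an assumed metric)."""
    return len(a.keys() & b.keys()) / len(a.keys() | b.keys())

def cycles(stmts, config):
    """Assumed cost: statements run in sequence; each operator kind shares its units."""
    return sum(max(ceil(n / config[op]) for op, n in s.items()) for s in stmts)

def shrink_to_fit(config, capacity):
    """Drop replicated units until the configuration fits; None if it cannot fit."""
    config = dict(config)
    while area(config) > capacity:
        op = max(config, key=config.get)   # most replicated operator kind
        if config[op] == 1:
            return None                    # one unit of each kind is already too big
        config[op] -= 1
    return config

def group_into_cutsets(stmts, capacity, threshold=0.5):
    """Pass 1: greedily group consecutive statements with similar demands."""
    cutsets = []                           # list of (statements, configuration)
    for s in stmts:
        if cutsets:
            members, cfg = cutsets[-1]
            merged = union(cfg, s)
            if similarity(cfg, s) >= threshold and area(merged) <= capacity:
                cutsets[-1] = (members + [s], merged)
                continue
        cutsets.append(([s], dict(s)))
    return cutsets

def merge_cutsets(cutsets, capacity, reconfig_cost):
    """Pass 2: merge neighbours when a shared configuration costs less than reconfiguring."""
    if not cutsets:
        return []
    result = [cutsets[0]]
    for members, cfg in cutsets[1:]:
        prev_members, prev_cfg = result[-1]
        shared = shrink_to_fit(union(prev_cfg, cfg), capacity)
        if shared is not None:
            slowdown = (cycles(prev_members + members, shared)
                        - cycles(prev_members, prev_cfg) - cycles(members, cfg))
            if slowdown < reconfig_cost:
                result[-1] = (prev_members + members, shared)
                continue
        result.append((members, cfg))
    return result

# Made-up loop body: four statements and their operator demands.
stmts = [{"add": 2}, {"add": 2, "mul": 1}, {"div": 1}, {"add": 1, "div": 1}]
cutsets = group_into_cutsets(stmts, capacity=16)
print(len(cutsets), len(merge_cutsets(cutsets, capacity=16, reconfig_cost=10)))
```

In this toy model the second pass merges two cutsets only when a shared configuration fits in the available area and the cycles lost by sharing it are fewer than the assumed reconfiguration cost. With a smaller `capacity` the shared configuration may not fit, and the cutsets stay separate at the price of a reconfiguration between them.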

Cite this paper

@inproceedings{Ramasubramanian1998AutomaticAO, title={Automatic Analysis of Loops to}, author={Narasimhan Ramasubramanian and Ram Subramanian and Santosh Pande}, year={1998} }