Portable parallel performance from sequential, productive, embedded domain-specific languages

Abstract

Domain-expert productivity programmers desire scalable application performance, but usually must rely on efficiency programmers, who are experts in explicit parallel programming, to achieve it. Since such programmers are rare, to maximize reuse of their work we propose encapsulating their strategies in mini-compilers for domain-specific embedded languages (DSELs) glued together by a common high-level host language familiar to productivity programmers. The nontrivial applications that use these DSELs achieve up to 98% of peak attainable performance, comparable to or better than existing hand-coded implementations. Our approach is unique in that each mini-compiler not only performs conventional compiler transformations and optimizations, but also includes imperative procedural code that captures an efficiency expert's strategy for mapping a narrow domain onto a specific type of hardware. The result is source and performance portability for productivity programmers, with parallel performance that rivals that of hand-coded efficiency-language implementations of the same applications. We describe a framework that supports our methodology and five implemented DSELs supporting common computation kernels. Our results demonstrate that for several interesting classes of problems, efficiency-level parallel performance can be achieved by packaging efficiency programmers' expertise in a reusable framework that is easy to use for both productivity programmers and efficiency programmers.
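To make the mini-compiler idea concrete, here is a minimal sketch in Python, chosen only for illustration since the abstract does not name the host language. It is not the paper's framework: the toy DSEL (element-wise maps written as a one-argument lambda), the names `in_domain` and `specialize`, and the `workers` parameter standing in for a hardware description are all assumptions. The sketch shows the two ingredients the abstract describes: a check that the kernel stays inside a narrow domain, and imperative strategy code that decides how to map that domain onto the available hardware. It uses Python threads purely to show the structure and makes no performance claim.

```python
import ast
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy DSEL: element-wise kernels written as a one-argument lambda
# that uses only arithmetic on its argument and constants.
ALLOWED = (ast.Expression, ast.Lambda, ast.arguments, ast.arg, ast.BinOp,
           ast.UnaryOp, ast.Name, ast.Constant, ast.Load,
           ast.Add, ast.Sub, ast.Mult, ast.USub)


def in_domain(source):
    """Return True if `source` is a lambda built only from the allowed nodes."""
    try:
        tree = ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    return isinstance(tree.body, ast.Lambda) and all(
        isinstance(node, ALLOWED) for node in ast.walk(tree))


def specialize(source, workers):
    """Toy 'mini-compiler': pick an execution strategy for an in-domain kernel.

    `workers` is an assumed stand-in for a description of the target hardware.
    """
    if not in_domain(source):
        raise ValueError("kernel is outside the toy DSEL")
    kernel = eval(compile(ast.parse(source, mode="eval"), "<dsel>", "eval"))

    def sequential(data):
        return [kernel(x) for x in data]

    def chunked(data):
        # Strategy code: one contiguous chunk per worker, results kept in order.
        data = list(data)
        size = max(1, -(-len(data) // workers))  # ceiling division
        chunks = [data[i:i + size] for i in range(0, len(data), size)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            parts = pool.map(sequential, chunks)
        return [y for part in parts for y in part]

    return chunked if workers > 1 else sequential


run = specialize("lambda x: x * x + 1", workers=4)
squares_plus_one = run(range(10))
```

The productivity programmer only writes the sequential-looking kernel; the choice between `sequential` and `chunked` lives entirely inside `specialize`, which is where an efficiency expert's strategy would be encoded in this sketch.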

DOI: 10.1145/2145816.2145865

11 Figures and Tables

Cite this paper

@inproceedings{Kamil2012PortablePP,
  title={Portable parallel performance from sequential, productive, embedded domain-specific languages},
  author={Shoaib Kamil and Derrick Coetzee and Scott Beamer and Henry Cook and Ekaterina Gonina and Jonathan Harper and Jeffrey Morlan and Armando Fox},
  booktitle={PPOPP},
  year={2012}
}