Building-Blocks for Performance Oriented DSLs

@inproceedings{Rompf2011BuildingBlocksFP,
  title={Building-Blocks for Performance Oriented DSLs},
  author={Tiark Rompf and Arvind K. Sujeeth and HyoukJoong Lee and Kevin J. Brown and Hassan Chafi and Martin Odersky and Kunle Olukotun},
  booktitle={DSL},
  year={2011}
}
Domain-specific languages raise the level of abstraction in software development. While it is evident that programmers can more easily reason about very high-level programs, the same holds for compilers only if the compiler has an accurate model of the application domain and the underlying target platform. Since mapping high-level, general-purpose languages to modern, heterogeneous hardware is becoming increasingly difficult, DSLs are an attractive way to capitalize on improved hardware… 
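The paper builds on Lightweight Modular Staging (LMS), in which DSL terms of type Rep[T] construct an intermediate representation rather than computing values immediately. Below is a minimal, self-contained Scala sketch of that idea (hypothetical IR and names, not the actual LMS API): the staged base b becomes IR nodes, while the present-stage exponent n is consumed during staging, so only multiplications remain in the generated code.

    object StagingSketch {
      sealed trait Exp                                // tiny expression IR
      case class Const(d: Double) extends Exp
      case class Sym(name: String) extends Exp
      case class Times(a: Exp, b: Exp) extends Exp

      type Rep[T] = Exp                               // a staged value is an IR node

      def times(a: Rep[Double], b: Rep[Double]): Rep[Double] = Times(a, b)

      // 'b' is staged, 'n' is an ordinary Int evaluated now: the recursion
      // unrolls during staging and leaves only multiplications in the IR.
      def power(b: Rep[Double], n: Int): Rep[Double] =
        if (n == 0) Const(1.0) else times(b, power(b, n - 1))

      def main(args: Array[String]): Unit =
        println(power(Sym("x"), 3))                   // Times(Sym(x),Times(Sym(x),Times(Sym(x),Const(1.0))))
    }

From such an IR, an embedded compiler can apply domain-specific optimizations and emit specialized low-level code for the target platform.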

Citations

StagedSAC: a case study in performance-oriented DSL development
TLDR
This case study describes how the author evolved a pure library DSL into a performance-oriented compiled DSL with a good speedup and only minor syntax changes, using the technique of Lightweight Modular Staging.
A Heterogeneous Parallel Framework for Domain-Specific Languages
TLDR
Presents the Delite Compiler Framework and Runtime, a new end-to-end system for building, compiling, and executing DSL applications on parallel heterogeneous hardware, along with results comparing the performance of several machine learning applications written in OptiML.
Composition and Reuse with Compiled Domain-Specific Languages
TLDR
Presents four new performance-oriented DSLs developed with Delite, an extensible DSL compilation framework, demonstrating new techniques for composing compiled DSLs embedded in a common backend within a single program, and showing that generic optimizations can be applied across the different DSL sections.
GraphIt to CUDA Compiler in 2021 LOC: A Case for High-Performance DSL Implementation via Staging with BuilDSL
TLDR
This paper demonstrates how to build an end-to-end DSL compiler framework and a graph DSL using multi-stage programming in C++ and shows how the staged types can be extended to perform domain-specific data flow and control flow analyses and transformations.
Compile-Time Type-Driven Data Representation Transformations in Object-Oriented Languages
TLDR
This thesis presents miniboxing, a compile-time transformation that replaces generic classes with more efficient variants optimized to handle primitive types, and Data-centric Metaprogramming, a technique that allows programmers to go beyond standard compiler optimizations by defining custom representations for their data.
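As a rough illustration of the kind of rewrite such a representation transformation performs (a hand-written sketch, not the miniboxing plugin's actual output):

    object MiniboxingSketch {
      // Generic version: on the JVM, an Int type argument is boxed to java.lang.Integer.
      class Pair[T](val fst: T, val snd: T) { def swap = new Pair(snd, fst) }

      // Specialized variant of the kind the transformation generates:
      // fields hold primitive Ints directly, avoiding boxing and indirection.
      class Pair_I(val fst: Int, val snd: Int) { def swap = new Pair_I(snd, fst) }

      def main(args: Array[String]): Unit = {
        val boxed   = new Pair(1, 2).swap     // allocates boxed Integers
        val unboxed = new Pair_I(1, 2).swap   // stays on primitives
        println((boxed.fst, unboxed.fst))     // (2,2)
      }
    }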
Lightweight Modular Staging and Embedded Compilers: Abstraction without Regret for High-Level High-Performance Programming
TLDR
This thesis proposes a hybrid design: integrate compilers into programs so that programs can take control of the translation process, but rely on libraries of common compiler functionality for help.
First-class isomorphic specialization by staged evaluation
TLDR
Presents a systematic approach and formalized framework for implementing software components with a first-class specialization capability, and shows how to extend a higher-order functional language with abstraction mechanisms carefully designed to provide automatic and guaranteed elimination of abstraction overhead.
Lightweight modular staging: a pragmatic approach to runtime code generation and compiled DSLs
TLDR
Argues that lightweight modular staging enables a form of language virtualization, i.e., it allows going from a pure-library embedded language to one that is practically equivalent to a stand-alone implementation with only modest effort.
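A minimal sketch of that spectrum, written in a tagless-final style rather than LMS's actual Rep-based API (all names hypothetical): the same DSL program runs either as a pure library or as a code generator, depending on which interpretation is plugged in.

    object VirtualizationSketch {
      trait Nums {
        type R                                 // representation of a DSL value
        def lit(d: Double): R
        def add(a: R, b: R): R
      }

      // The DSL program is written once, independent of the representation.
      def prog(n: Nums)(x: n.R): n.R = n.add(x, n.lit(1.0))

      object Eval extends Nums {               // pure-library embedding: evaluate directly
        type R = Double
        def lit(d: Double) = d
        def add(a: Double, b: Double) = a + b
      }

      object Gen extends Nums {                // "compiled" embedding: emit code text
        type R = String
        def lit(d: Double) = d.toString
        def add(a: String, b: String) = s"($a + $b)"
      }

      def main(args: Array[String]): Unit = {
        println(prog(Eval)(41.0))              // 42.0
        println(prog(Gen)("x"))                // (x + 1.0)
      }
    }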
Extensible Languages: Blurring the Distinction between DSL and GPL
TLDR
The purpose of this chapter is to provide a tour of the features that make a GPL extensible, and to demonstrate how the distinction between DSL and GPL can blur, sometimes to the point of complete disappearance.
Spiral in scala: towards the systematic construction of generators for performance libraries
TLDR
This paper abstracts over different complex data representations jointly with different code representations, including generating loops versus unrolled code with scalar replacement, a crucial and usually tedious performance transformation.
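A hedged sketch of what abstracting over the code representation can look like (hypothetical names, not Spiral's actual generator API): the same dot-product generator emits either a loop over arrays or fully unrolled, scalar-replaced code.

    object CodeRepSketch {
      trait CodeGen { def dot(n: Int): String }     // emits C-like code as text

      object Looped extends CodeGen {
        def dot(n: Int): String =
          s"""double acc = 0.0;
             |for (int i = 0; i < $n; i++) acc += a[i] * b[i];""".stripMargin
      }

      object UnrolledScalarReplaced extends CodeGen {
        def dot(n: Int): String = {
          val loads = (0 until n).map(i => s"double a$i = a[$i], b$i = b[$i];")
          val sum   = (0 until n).map(i => s"a$i * b$i").mkString(" + ")
          (loads :+ s"double acc = $sum;").mkString("\n")
        }
      }

      def main(args: Array[String]): Unit = {
        println(Looped.dot(4))
        println(UnrolledScalarReplaced.dot(4))
      }
    }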
...

References

Showing 1-10 of 59 references
A domain-specific approach to heterogeneous parallelism
TLDR
Introduces Delite, a system designed specifically for DSLs that is both a framework for creating an implicitly parallel DSL and a dynamic runtime providing automated targeting to heterogeneous parallel hardware.
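A minimal sketch of the underlying idea (hypothetical names, not Delite's real op hierarchy): DSL operations are expressed as structured parallel patterns, so a runtime can later decide how and where to execute them.

    object ParallelPatternSketch {
      sealed trait ParallelOp
      case class MapOp(in: Seq[Int], f: Int => Int)                       extends ParallelOp
      case class ReduceOp(in: Seq[Int], zero: Int, op: (Int, Int) => Int) extends ParallelOp

      // Trivial sequential "runtime"; a real one could inspect the op's structure
      // and schedule it across multicore CPUs or GPUs instead.
      def execute(op: ParallelOp): Any = op match {
        case MapOp(in, f)          => in.map(f)
        case ReduceOp(in, zero, f) => in.foldLeft(zero)(f)
      }

      def main(args: Array[String]): Unit = {
        println(execute(MapOp(Seq(1, 2, 3), x => x * x)))    // List(1, 4, 9)
        println(execute(ReduceOp(Seq(1, 2, 3), 0, _ + _)))   // 6
      }
    }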
Intel's Array Building Blocks: A retargetable, dynamic compiler and embedded language
TLDR
This paper introduces Intel® Array Building Blocks (ArBB), which is a retargetable dynamic compilation framework that focuses on making it easier to write and port programs so that they can harvest data and thread parallelism on both multi-core and heterogeneous many-core architectures, while staying within standard C++.
Lightweight modular staging: a pragmatic approach to runtime code generation and compiled DSLs
TLDR
Argues that lightweight modular staging enables a form of language virtualization, i.e., it allows going from a pure-library embedded language to one that is practically equivalent to a stand-alone implementation with only modest effort.
Design of the CodeBoost transformation system for domain-specific optimisation of C++ programs
TLDR
This work presents CodeBoost, a source-to-source transformation tool for domain-specific optimisation of C++ programs, and takes a closer look at two important features of CodeBoost: user-defined rules and totem annotations.
Telescoping Languages: A System for Automatic Generation of Domain Languages
TLDR
The approach calls for using a library-preprocessing phase to extensively analyze and optimize collections of libraries that define an extended language, which enables script optimization to benefit from the intense analysis performed during preprocessing without repaying its price.
An annotation language for optimizing software libraries
TLDR
Introduces an annotation language and a compiler that together can customize a library implementation for specific application needs, and shows how the system can significantly improve the performance of two applications written using the PLAPACK parallel linear algebra library.
Runtime Code Generation in C++ as a Foundation for Domain-Specific Optimisation
TLDR
This chapter presents the design of the TaskGraph library and two sample applications that demonstrate its use for runtime code specialisation and restructuring optimisation.
Language virtualization for heterogeneous parallel computing
TLDR
This work proposes language virtualization as a new principle that enables the construction of highly efficient parallel domain-specific languages embedded in a common host language.
Template meta-programming for Haskell
TLDR
Presents a new extension to the purely functional programming language Haskell that supports compile-time meta-programming; the ability to generate code at compile time allows the programmer to implement such features as polytypic programs, macro-like expansion, user-directed optimization, and the generation of supporting data structures and functions from existing data structures and functions.
A methodology for generating verified combinatorial circuits
TLDR
This paper investigates the use of RAP languages for the generation of combinatorial circuits and proposes and studies the use of abstract interpretation to overcome the key challenge that the RAP approach does not safely admit a mechanism to express a posteriori (post-generation) optimizations.
...