Lightweight modular staging: a pragmatic approach to runtime code generation and compiled DSLs

@article{Rompf2012LightweightMS,
  title={Lightweight modular staging: a pragmatic approach to runtime code generation and compiled DSLs},
  author={Tiark Rompf and Martin Odersky},
  journal={Commun. ACM},
  year={2012},
  volume={55},
  pages={121-130}
}
Good software engineering practice demands generalization and abstraction, whereas high performance demands specialization and concretization. These goals are at odds, and compilers can only rarely translate expressive high-level programs to modern hardware platforms in a way that makes best use of the available resources. Generative programming is a promising alternative to fully automatic translation. Instead of writing down the target program directly, developers write a program generator… 
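The idea of writing a program generator instead of the target program can be sketched in a few lines. The example below is an illustration in Python only, not the Scala-based API of Lightweight Modular Staging; the names `gen_power` and `compile_power` are hypothetical. It takes a general power function and generates a specialized, loop-free version for a fixed exponent.

```python
def gen_power(n: int) -> str:
    """Generate Python source for a function that computes x**n.

    The exponent n is known at generation time, so the loop over n is
    unrolled into a chain of multiplications in the generated code.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    body = "1" if n == 0 else " * ".join(["x"] * n)
    return f"def power_{n}(x):\n    return {body}\n"


def compile_power(n: int):
    """Run the generator, then compile and return the specialized function."""
    namespace = {}
    exec(gen_power(n), namespace)  # second stage: compile the generated source
    return namespace[f"power_{n}"]


# The general description (any n) stays in the generator; the specialized
# program (here n = 3) contains no loop and no reference to n.
power_3 = compile_power(3)
```

Here `gen_power(3)` produces a function whose body is `x * x * x`: the abstraction lives in the generator, while the generated code is fully specialized.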
Quoted staged rewriting: a practical approach to library-defined optimizations
TLDR
This work introduces Quoted Staged Rewriting (QSR), an approach that uses type-safe, pattern matching-enabled quasiquotes to define optimizations and renders library-defined optimizations more practical than ever before.
Language-integrated privacy-aware distributed queries
TLDR
This work presents a new methodology for privacy-aware operator placement that both prevents leakage of sensitive information and improves performance. It implements the type system and placement algorithm for a new query language, SecQL, and demonstrates significant performance improvements in benchmarks.
Staging for generic programming in space and time
Metaprogramming is among the most promising candidates to solve the abstraction vs performance trade-off that plagues software engineering through specialization. Metaprogramming has been used to
Fast and Modular Whole-Program Analysis via MetaProgramming
It is well known that a staged interpreter is a compiler: specializing the interpreter to a given program produces an equivalent program that runs faster. It is even more widely known that an
A SQL to C compiler in 500 lines of code
We present the design and implementation of a SQL query processor that outperforms existing database systems and is written in just about 500 lines of Scala code – a convincing case study
Yin-yang: concealing the deep embedding of DSLs
TLDR
Yin-Yang is a framework for DSL embedding that uses Scala macros to reliably translate shallow EDSL programs to the corresponding deep EDSL programs and automatically generates the deep DSL embeddings from their shallow counterparts by reusing the core translation.
Spiral in scala: towards the systematic construction of generators for performance libraries
TLDR
This paper abstracts over different complex data representations jointly with different code representations, including generating loops versus unrolled code with scalar replacement, a crucial and usually tedious performance transformation.
Handling Iterations in Distributed Dataflow Systems
TLDR
This survey reviews the research literature and identifies how distributed dataflow systems (DDS) handle control flow, such as iteration, from both the programming-model and execution-level perspectives; it will be of interest to both users and designers of DDS.
LLJava live at the loop: a case for heteroiconic staged meta-programming
TLDR
LLJava-live, the staged API of the low-level JVM language LLJava, can be used to complement an interpreted EDSL with orthogonal and extensible compilation facilities to accelerate embedded domain-specific languages on the Java platform.
Metaprogramming with combinators
TLDR
This work advocates for a point in the design space, which is metaprogramming with combinators, where programmers use (and write) combinator libraries that directly manipulate object language terms, to provide what is essentially a rich, well-typed macro language.

References

Building-Blocks for Performance Oriented DSLs
TLDR
The Delite Framework is presented, an extensible toolkit that drastically simplifies building embedded DSLs and compiling DSL programs for execution on heterogeneous hardware.
Implementing Domain-Specific Languages for Heterogeneous Parallel Computing
TLDR
The Delite compiler framework simplifies the process of building embedded parallel DSLs and automatically schedules and executes DSL operations on heterogeneous hardware.
Finally tagless, partially evaluated: Tagless staged interpreters for simpler typed languages
TLDR
This family of tagless interpretations for a higher-order typed object language in a typed metalanguage (Haskell or ML) that require no dependent types, generalized algebraic data types, or postprocessing to eliminate tags demonstrates again that it is useful to abstract over higher-kinded types.
Shifting the stage: staging with delimited control
TLDR
The first two-level calculus with control effects and a sound type system is introduced, with which efficient code generators for dynamic programming and numerical methods can finally be written in direct style, as in algorithm textbooks, rather than in CPS or monadic style.
Polymorphic embedding of DSLs
TLDR
With polymorphic embedding of DSLs, the static type-safety, modularity, composability and rapid prototyping of pure embedding are reconciled with the flexibility attainable by external toolchains.
A monadic approach for avoiding code duplication when staging memoized functions
TLDR
A staged monadic combinator library is presented to solve the problem of code duplication that can arise when memoized functions are staged; it applies to any function that uses memoization.
In search of a program generator to implement generic transformations for high-performance computing
TLDR
By mimicking complex sequences of transformations useful to optimize real codes, it is shown that generative programming is a practical means to implement architecture-aware optimizations for high-performance applications and that complex, architecture-specific optimizations can be implemented in a type-safe, purely generative framework.
Scalable component abstractions
TLDR
Three programming language abstractions are identified for the construction of reusable components: abstract type members, explicit selftypes, and modular mixin composition, which enable an arbitrary assembly of static program parts with hard references between them to be transformed into a system of reusable components.
A methodology for generating verified combinatorial circuits
TLDR
This paper investigates the use of RAP languages for the generation of combinatorial circuits and proposes and studies the use of abstract interpretation to overcome the key challenge that the RAP approach does not safely admit a mechanism to express a posteriori (post-generation) optimizations.
Active libraries and universal languages
Universal programming languages are an old dream. There is the computability sense of Turing-universal; Landin and others have advocated syntactically universal languages, a path leading to