Dynamic native optimization of interpreters

@inproceedings{Sullivan2003DynamicNO,
  title={Dynamic native optimization of interpreters},
  author={Gregory T. Sullivan and Derek Bruening and Iris Baron and Timothy Garnett and Saman P. Amarasinghe},
  booktitle={IVME '03},
  year={2003}
}
For domain-specific languages, "scripting languages", dynamic languages, and virtual machine-based languages, the most straightforward implementation strategy is to write an interpreter. A simple interpreter consists of a loop that fetches the next bytecode, dispatches to the routine handling that bytecode, then loops back. There are many ways to improve upon this simple mechanism, but as long as the execution of the program is driven by a representation of the program other than as a stream of…
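
To make the fetch-dispatch cycle concrete, the following is a minimal sketch of such a loop in C. The tiny bytecode set (PUSH, ADD, PRINT, HALT), the operand stack, and the program are illustrative assumptions, not taken from the paper; they only show where the per-bytecode fetch and dispatch overhead lives.

#include <stdio.h>

enum { OP_PUSH, OP_ADD, OP_PRINT, OP_HALT };

static void run(const unsigned char *code) {
    long stack[64];
    int sp = 0;                          /* operand stack pointer */
    int pc = 0;                          /* bytecode program counter */
    for (;;) {
        unsigned char op = code[pc++];   /* fetch the next bytecode */
        switch (op) {                    /* dispatch to its handler */
        case OP_PUSH:  stack[sp++] = code[pc++];          break;
        case OP_ADD:   sp--; stack[sp - 1] += stack[sp];  break;
        case OP_PRINT: printf("%ld\n", stack[sp - 1]);    break;
        case OP_HALT:  return;
        }
    }
}

int main(void) {
    /* push 2; push 3; add; print; halt  -- prints 5 */
    const unsigned char program[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT, OP_HALT };
    run(program);
    return 0;
}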

YETI: a graduallY extensible trace interpreter
TLDR
This paper describes how callable bodies help the Yeti interpreter to efficiently identify and run traces, and how the closely coupled dynamic compiler can fall back on the interpreter in various ways, permitting an incremental approach.
Optimization of dynamic languages using hierarchical layering of virtual machines
TLDR
This work explores the approach of taking an interpreter of a dynamic language and running it on top of an optimizing trace-based virtual machine, i.e., the authors run a guest VM on top of a host VM, thus eliminating the need for a custom just-in-time compiler for the guest VM.
Mixed mode execution with context threading
TLDR
A preliminary version of a code generator that compiles a region into a sequence of direct calls to bytecode bodies is reported, and the selection and dispatch effectiveness of three common region shapes (whole methods, partial methods, and SPECL traces) is compared.
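
As a rough illustration of the "direct calls to bytecode bodies" idea above, the C sketch below models each bytecode body as an ordinary function and a straight-line region as a fixed sequence of direct calls; the real technique emits native call instructions for the region, and the VM structure, opcode bodies, and program here are assumptions made only for illustration.

#include <stdio.h>

typedef struct { long stack[16]; int sp; long operand; } VM;

/* Each bytecode body is an ordinary callable routine. */
static void body_push(VM *vm)  { vm->stack[vm->sp++] = vm->operand; }
static void body_add(VM *vm)   { vm->sp--; vm->stack[vm->sp - 1] += vm->stack[vm->sp]; }
static void body_print(VM *vm) { printf("%ld\n", vm->stack[vm->sp - 1]); }

int main(void) {
    VM vm = { .sp = 0 };
    /* A straight-line region: instead of looping through a dispatch table,
       the generated code simply calls one body after another. */
    vm.operand = 2; body_push(&vm);
    vm.operand = 3; body_push(&vm);
    body_add(&vm);
    body_print(&vm);   /* prints 5 */
    return 0;
}
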
Simple optimizing JIT compilation of higher-order dynamic programming languages
TLDR
This work proposes a new approach and new techniques for building optimizing just-in-time compilers for dynamic languages with relatively good performance and low development effort, and presents the experience of building a JIT compiler for the Scheme language using these techniques.
Tracing the meta-level: PyPy's tracing JIT compiler
TLDR
This paper shows how to guide tracing JIT compilers to greatly improve the speed of bytecode interpreters, and how to unroll the bytecode dispatch loop, based on two kinds of hints provided by the implementer of the bytecode interpreter.
Dynamic optimization of interpreters using DynamoRIO
TLDR
This thesis adds simple annotations to interpreters and modifies the trace creation methodology of DynamoRIO so that its traces correspond to frequent sequences of code in the high-level (interpreted) program rather than in the interpreter.
Catenation and specialization for Tcl virtual machine performance
TLDR
In the context of the Tcl VM, bytecodes are converted to native SPARC code by concatenating the native instructions used by the VM to implement each bytecode instruction, and the dispatch loop is eliminated.
Meta-tracing makes a fast Racket
TLDR
The result of spending just a couple person-months implementing and tuning an implementation of Racket written in RPython is presented, with a geometric mean equal to Racket’s performance and within a factor of 2 slower than Gambit and Larceny on a collection of standard Scheme benchmarks.
Retargeting JIT compilers by using C-compiler generated executable code
  • M. Ertl, David Gregg
  • Computer Science
    Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004.
  • 2004
TLDR
This paper proposes to combine the advantages of these language implementation techniques as follows: it generates native code by concatenating and patching machine code fragments taken from interpreter-derived code (generated by a C compiler); it completely eliminates the interpreter dispatch overhead and accesses to the interpreted code by patching jump target addresses and other constants into the fragments.
Dynamic optimization of IA-32 applications under DynamoRIO
TLDR
This thesis presents two uses of the DynamoRIO runtime introspection and modification system to optimize applications at runtime, and a proof of concept for accelerating the performance of IA-32 applications running on the Itanium processor by dynamically translating hot trace paths into Itanium IA-64 assembly through DynamoRIO.

References

Showing 1-10 of 79 references
DAISY: Dynamic Compilation for 100% Architectural Compatibility
  • K. Ebcioglu, E. Altman
  • Computer Science
    Conference Proceedings. The 24th Annual International Symposium on Computer Architecture
  • 1997
TLDR
The architectural requirements for such a VLIW to deal with issues including self-modifying code, precise exceptions, and aggressive reordering of memory references in the presence of strong MP consistency and memory-mapped I/O are discussed.
Optimizing direct threaded code by selective inlining
TLDR
It is demonstrated that a few simple techniques make it possible to create highly-portable dynamic translators that can attain as much as 70% the performance of optimized C for certain numerical computations.
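
For context, the sketch below shows direct-threaded dispatch in C using the GCC/Clang labels-as-values extension; the handlers and program are illustrative assumptions. Selective inlining, as described in the paper, additionally copies the compiled code of frequent handler sequences into a single unit, removing even the per-bytecode indirect jumps within such a sequence.

#include <stdio.h>

int main(void) {
    long stack[16];
    int sp = 0;

    /* Direct-threaded "code": each instruction slot holds the address of
       its handler, followed inline by any operands; a loader would build
       this array from the original bytecode. */
    void *code[] = {
        &&op_push, (void *)2L,
        &&op_push, (void *)3L,
        &&op_add,
        &&op_print,
        &&op_halt
    };
    void **ip = code;

    goto **ip++;                      /* begin dispatching */

op_push:
    stack[sp++] = (long)*ip++;        /* operand lives in the code stream */
    goto **ip++;                      /* jump directly to the next handler */
op_add:
    sp--; stack[sp - 1] += stack[sp];
    goto **ip++;
op_print:
    printf("%ld\n", stack[sp - 1]);   /* prints 5 */
    goto **ip++;
op_halt:
    return 0;
}
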
A general approach for run-time specialization and its application to C
TLDR
This paper describes a general approach to run-time specialization that automatically produces source templates at compile time and transforms them so that they can be processed by a standard compiler; the approach is efficient, as shown by an implementation for the C language.
Fast, effective code generation in a just-in-time Java compiler
TLDR
The structure of a Java JIT compiler for the Intel Architecture is presented, the lightweight implementation of JIT compilation optimizations are described, and the performance benefits and tradeoffs of the optimizations are evaluated.
An API for Runtime Code Patching
TLDR
The authors present a postcompiler program manipulation tool called Dyninst, which provides a C++ class library for program instrumentation that permits machine-independent binary instrumentation programs to be written.
alto: a link-time optimizer for the Compaq Alpha
TLDR
Alto, a link-time optimizer for the Compaq Alpha architecture, is described; it is able to realize significant performance improvements even for programs compiled with a good optimizing compiler at a high level of optimization.
Optimizing ML with run-time code generation
TLDR
This work describes the design and implementation of a compiler that automatically translates ordinary programs written in a subset of ML into code that generates native code at run time, and demonstrates how compile-time specialization can reduce the cost of run-time code generation by an order of magnitude.
Machine-adaptable dynamic binary translation
TLDR
This research provides a more general framework for dynamic binary translation, based on machine specifications that can be reused or adapted to new hardware architectures, and reports some initial results obtained with this system.
Dynamo: A Staged Compiler Architecture for Dynamic Program Optimization
TLDR
Recent research has shown that dynamic compilation can dramatically improve the performance of a wide range of applications including network packet demultiplexing, sparse matrix computations, pattern matching, and many forms of mobile code.