HotpathVM: an effective JIT compiler for resource-constrained devices

@inproceedings{Gal2006HotpathVMAE,
  title={HotpathVM: an effective JIT compiler for resource-constrained devices},
  author={Andreas Gal and Christian W. Probst and Michael Franz},
  booktitle={VEE '06},
  year={2006}
}
We present a just-in-time compiler for a Java VM that is small enough to fit on resource-constrained devices, yet is surprisingly effective. Our system dynamically identifies traces of frequently executed bytecode instructions (which may span several basic blocks across several methods) and compiles them via Static Single Assignment (SSA) construction. Our novel use of SSA form in this context allows us to hoist instructions across trace side-exits without necessitating expensive compensation code…
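To make the trace-anchoring idea in the abstract concrete, here is a minimal, hypothetical sketch in Java of counter-based hot-loop detection at backward-branch targets, which is the general technique the abstract alludes to. The toy opcodes, the HOT_THRESHOLD constant, and the interpret method are illustrative assumptions, not HotpathVM's actual implementation.

import java.util.HashMap;
import java.util.Map;

public class TraceAnchorSketch {
    // Toy bytecode set, not the real JVM instruction set.
    static final int PUSH = 0, ADD = 1, JUMP_BACK = 2, HALT = 3;
    static final int HOT_THRESHOLD = 50; // assumed tuning value

    public static void main(String[] args) {
        // A loop that adds 1 to a running total 100 times.
        int[] code = {PUSH, 0, PUSH, 1, ADD, JUMP_BACK, 2, HALT};
        interpret(code);
    }

    static void interpret(int[] code) {
        int[] stack = new int[16];
        int sp = 0, pc = 0;
        Map<Integer, Integer> hitCounts = new HashMap<>();
        int iterations = 0;

        while (pc < code.length) {
            switch (code[pc]) {
                case PUSH:
                    stack[sp++] = code[pc + 1];
                    pc += 2;
                    break;
                case ADD: {
                    int b = stack[--sp], a = stack[--sp];
                    stack[sp++] = a + b;
                    pc += 1;
                    break;
                }
                case JUMP_BACK: {
                    int target = code[pc + 1];
                    // Backward-branch targets are candidate trace anchors:
                    // bump a counter and flag the loop once it gets hot.
                    int hits = hitCounts.merge(target, 1, Integer::sum);
                    if (hits == HOT_THRESHOLD) {
                        System.out.println("hot loop: anchor trace at pc=" + target);
                        // A real system would now record the bytecodes executed
                        // until the loop closes, build SSA for the trace, and
                        // emit native code, as the abstract describes.
                    }
                    pc = (++iterations < 100) ? target : pc + 2;
                    break;
                }
                case HALT:
                    System.out.println("result on stack: " + stack[sp - 1]);
                    return;
                default:
                    throw new IllegalStateException("bad opcode at pc=" + pc);
            }
        }
    }
}

Anchoring at backward branches is what makes loop bodies the natural unit of trace compilation in systems of this kind.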
Ahead-of-Time Compilation of Stack-Based JVM Bytecode on Resource-Constrained Devices
TLDR
This paper identifies three distinct sources of overhead, two of which are related to the JVM’s stack-based architecture, proposes a set of optimisations targeting each of them, and reduces code size overhead by 59%.
Swift: a register-based JIT compiler for embedded JVMs
TLDR
This paper presents a fast and effective JIT technique for mobile devices, building on a register-based Java bytecode format that is closer to the underlying machine architecture, and proposes Swift, a novel JIT compiler on register-based bytecode that generates native code for RISC machines.
Trace-based compilation and optimization in meta-circular virtual execution environments
TLDR
This dissertation explores an alternative approach in which only truly hot code paths are ever compiled, which compiles significantly less code and improves the performance of both statically and dynamically typed programming languages.
Trace-based just-in-time type specialization for dynamic languages
TLDR
This work presents an alternative compilation technique for dynamically-typed languages that identifies frequently executed loop traces at run-time and then generates machine code on the fly that is specialized for the actual dynamic types occurring on each path through the loop.
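As a hedged illustration of the type specialization described above (assumed names, not the paper's code), the following Java sketch contrasts a generic dynamically typed add with a "trace" specialized for the integer case; the instanceof checks play the role of the type guards that trigger a side exit back to the generic path when the observed types no longer match.

public class TypeGuardSketch {
    // Generic, dynamically typed addition (the "interpreter" path).
    static Object genericAdd(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer)
            return (Integer) a + (Integer) b;
        return a.toString() + b.toString(); // e.g. fall back to concatenation
    }

    // Specialized "trace" for the Integer/Integer types observed at record
    // time; the guards side-exit to the generic path when they fail.
    static Object specializedAddTrace(Object a, Object b) {
        if (!(a instanceof Integer) || !(b instanceof Integer))
            return genericAdd(a, b);      // side exit: guard failed
        return (Integer) a + (Integer) b; // unboxed fast path
    }

    public static void main(String[] args) {
        System.out.println(specializedAddTrace(2, 3));   // fast path: 5
        System.out.println(specializedAddTrace("2", 3)); // side exit: "23"
    }
}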
Improved Ahead-of-time Compilation of Stack-based JVM Bytecode on Resource-constrained Devices
TLDR
This article identifies the major sources of overhead resulting from this basic approach and presents optimisations to remove most of the remaining performance overhead, and over half the size overhead, reducing them to 67% and 77%, respectively.
Generalized just-in-time trace compilation using a parallel task farm in a dynamic binary translator
TLDR
An industry-strength, LLVM-based parallel DBT implementing the ARCompact ISA is evaluated against three benchmark suites and speedups of up to 2.08 on a standard quad-core Intel Xeon machine are demonstrated.
Runtime feedback in a meta-tracing JIT for efficient dynamic languages
TLDR
The mechanisms in PyPy's meta-tracing JIT that can be used to control runtime feedback in language-specific ways are described, which are flexible enough to express classical VM techniques such as maps and runtime type feedback.
Context-sensitive trace inlining for Java
Allocation removal by partial evaluation in a tracing JIT
TLDR
This paper presents a simple compiler optimization based on online partial evaluation to remove object allocations and runtime type checks in the context of a tracing JIT and finds that it gives good results for all the authors' (real-life) benchmarks.

References

Inlining Java native calls at runtime
TLDR
This work leverages the ability to store statically-generated IL alongside native binaries, to facilitate native inlining at Java callsites at JIT compilation time and shows speedups of up to 93X when inlining and callback transformation are combined.
Armed E-Bunny: a selective dynamic compiler for embedded Java virtual machine targeting ARM processors
TLDR
This paper presents a new selective dynamic compilation technique targeting ARM 16/32-bit embedded system processors, called Armed E-Bunny, and demonstrates that a speedup of 360% over the latest version of Sun's KVM is accomplished with a footprint overhead that does not exceed 119KB.
Transparent Dynamic Optimization: The Design and Implementation of Dynamo
TLDR
The design and implementation of Dynamo is described, a prototype dynamic optimizer that is capable of optimizing a native program binary at runtime, and runs on a PA-RISC machine under the HPUX operating system.
Trace Scheduling: A Technique for Global Microcode Compaction
J. A. Fisher, IEEE Transactions on Computers, 1981
TLDR
Compilation of high-level microcode languages into efficient horizontal microcode and good hand coding probably both require effective global compaction techniques.
Analysis and development of Java Grande benchmarks
TLDR
A range of Java benchmark programs are reviewed, including those collected by the Java Grande Forum as useful performance indicators for large-scale scientific applications, or “Java Grande applications”, and some general trends in the performance of Java on various platforms are observed.
Efficiently computing static single assignment form and the control dependence graph
TLDR
New algorithms that efficiently compute static single assignment form and control dependence graph data structures for arbitrary control flow graphs are presented, and it is shown that all of these data structures are usually linear in the size of the original program.
Concise specifications of locally optimal code generators
TLDR
The Twig code-generator generator is built, which produces dynamic-programming code generators from grammar-like specifications that select very good (and, under the right assumptions, optimal) instruction sequences.
Using profile information to assist classic code optimizations
TLDR
Experimental results show that the profile‐based code optimizer significantly improves the performance of production C programs that have already been optimized by a high‐quality global code Optimizer.
BEG: a generator for efficient back ends
This paper describes a system that generates compiler back ends from a strictly declarative specification of the code generation process. The generated back ends use tree pattern matching for code…
Engineering a simple, efficient code-generator generator
TLDR
This paper describes a simple program that generates matchers that are fast, compact, and easy to understand and run up to 25 times faster than Twig's matchers.