Trace Register Allocation Policies: Compile-time vs. Performance Trade-offs

  • Josef Eisl, Stefan Marr, Thomas Würthinger, Hanspeter Mössenböck
  • Published 27 September 2017
  • Computer Science
  • Proceedings of the 14th International Conference on Managed Languages and Runtimes
Register allocation is an integral part of compilation, regardless of whether a compiler aims for fast compilation or optimal code quality. State-of-the-art dynamic compilers often use global register allocation approaches such as linear scan. Recent results suggest that non-global trace-based register allocation approaches can compete with global approaches in terms of allocation quality. Instead of processing the whole compilation unit (i.e., method) at once, a trace-based register allocator… 

Parallel trace register allocation

A theoretical model for parallel register allocation is developed and shown to be usable in practice without degrading the quality of the allocation result while reducing compilation latency, i.e., the time until the result of a compilation is available.

Efficient global register allocation

A new register allocation algorithm is described that overcomes the inability of the similarly motivated Treescan register allocator to look ahead of the instruction being allocated, which allows an unconstrained allocation order and better handling of fixed registers and loop-carried values.

An Optimization-Driven Incremental Inline Substitution Algorithm for Just-in-Time Compilers

Inlining is one of the most important compiler optimizations. It reduces call overheads and widens the scope of other optimizations. But inlining is somewhat of a black art of an optimizing…

Toward Register Spilling Security Using LLVM and ARM Pointer Authentication

A security solution for spilled registers is presented, generalizing the use of ARM pointer authentication (PA) for this purpose, and the protection is enforced by the LLVM compiler via additional compiler passes and modifications.



Trace-based Register Allocation in a JIT Compiler

A novel non-global algorithm is proposed that splits a compilation unit into traces based on profiling feedback and then performs register allocation within each trace individually, simplifying the problem a register allocator needs to solve.

Quality and speed in linear-scan register allocation

This paper implements linear-scan and graph-coloring register allocators within the Machine SUIF extension of the Stanford SUIF compiler system and describes improvements to the linear-scan approach that allow it to produce code of a quality close to that produced by graph coloring.

Linear scan register allocation on SSA form

The linear scan register allocator of the Java HotSpot client compiler is modified to operate on SSA form; the resulting simpler and faster version generates equally good or slightly better machine code.

On local register allocation

It is shown that the Local Register Allocation problem is NP-hard and that a variant of the furthest-first heuristic achieves a good approximation ratio. The experimental performance of a branch-and-bound algorithm and both approximation algorithms on standard benchmarks is also reported.

Optimized interval splitting in a linear scan register allocator

An optimized implementation of the linear scan register allocation algorithm for Sun Microsystems' Java HotSpot™ client compiler is presented, and the high impact of the Intel SSE2 extensions on the speed of numeric Java applications is demonstrated.

Avoidance and suppression of compensation code in a trace scheduling compiler

The implementation of trace scheduling improves the SPEC mark rating by 30% over basic block scheduling, while restricting trace scheduling so that no compensation code is required still improves the rating by 25%.

LaTTe: a Java VM just-in-time compiler with fast and efficient register allocation

LaTTe is introduced, a Java JIT compiler that performs fast and efficient register mapping and allocation for RISC machines and includes an enhanced object model, a lightweight monitor, a fast mark-and-sweep garbage collector, and an on-demand exception handling mechanism, all of which are closely coordinated with LaTTe's JIT compilation.

Efficient JavaVM just-in-time compilation

  • A. Krall
  • Computer Science
  • Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192)
  • 1998
A very fast algorithm for translating JavaVM byte code to high-quality machine code for RISC processors is presented. It replaces an older one in the CACAO JavaVM implementation, reducing the compile time by a factor of seven and producing slightly faster machine code.

Trace-based compilation for the Java HotSpot virtual machine

This paper presents the implementation of a trace-based JIT compiler in which the mature, method-based Java HotSpot client compiler is modified and a bytecode preprocessing step is added that detects and directly marks loops within the bytecodes to simplify trace recording.

Register allocation for programs in SSA form

A novel register allocation architecture for programs in SSA form is presented that simplifies register allocation significantly, and heuristic methods for spilling and coalescing are compared to an optimal method based on integer linear programming.