Trace-based Register Allocation in a JIT Compiler

@inproceedings{Eisl2016TracebasedRA,
  title={Trace-based Register Allocation in a JIT Compiler},
  author={Josef Eisl and Matthias Grimmer and Doug Simon and Thomas W{\"u}rthinger and Hanspeter M{\"o}ssenb{\"o}ck},
  booktitle={Proceedings of the 13th International Conference on Principles and Practices of Programming on the Java Platform: Virtual Machines, Languages, and Tools},
  year={2016}
}
  • J. Eisl, Matthias Grimmer, H. Mössenböck
  • Published 29 August 2016
  • Computer Science
  • Proceedings of the 13th International Conference on Principles and Practices of Programming on the Java Platform: Virtual Machines, Languages, and Tools
State-of-the-art dynamic compilers often use global approaches, like Linear Scan or Graph Coloring, for register allocation. Traces reduce the problem size to a single linear code segment, which simplifies the problem a register allocator needs to solve. Additionally, we can apply different register allocation algorithms to each trace. We show that this non-global approach can achieve results competitive with global register allocation. We present an implementation of Trace Register Allocation…

Trace Register Allocation Policies: Compile-time vs. Performance Trade-offs

This work presents a register allocation framework that can exploit the additional flexibility of traces to select different allocation strategies based on the characteristics of a trace, providing fine-grained control over the trade-off between compile time and peak performance in a just-in-time compiler.

Divide and Allocate: The Trace Register Allocation Framework (ACM Student Research Competition Grand Finals)

This work developed a novel trace register allocation framework which competes with global approaches in both compile time and code quality and is able to select different allocation strategies based on the characteristics of a trace to control the trade-off between compile time and peak performance.

Parallel trace register allocation

A theoretical model for parallel register allocation is developed and it is shown that it can be used in practice without a negative impact on the quality of the allocation result and reduces compilation latency, i.e., the duration until the result of a compilation is available.

Irregular Register Allocation for Translation of Test-pattern Programs

This article proposes a solution based on partitioned Boolean quadratic programming (PBQP) for ATE register allocation that trades off the allocation time and allocation search space, and experimental results show that the proposed register allocator successfully finds valid solutions in all cases.

A cost model for a graph-based intermediate-representation in a dynamic compiler

A cost model for Graal’s high-level intermediate representation is proposed that models relative operation latencies and operation sizes in order to be used in trade-off functions of compiler optimizations, allowing optimizations to perform fine-grained code size and performance trade-offs outperforming hard-coded heuristics.

QuickCheck: using speculation to reduce the overhead of checks in NVM frameworks

This paper proposes QuickCheck, a technique that biases persistence checks based on their expected behavior, and exploits speculative optimizations to further reduce the overheads of these persistence checks.

Renaissance: benchmarking suite for parallel applications on the JVM

Renaissance, a new benchmark suite composed of modern, real-world, concurrent, and object-oriented workloads that exercise various concurrency primitives of the JVM, is presented and it is shown that the use of concurrency primitives in these workloads reveals optimization opportunities that were not visible with the existing workloads.

Scalable pointer analysis of data structures using semantic models

Pointer analysis is widely used as a base for different kinds of static analyses and compiler optimizations. Designing a scalable pointer analysis with acceptable precision for use in production

On Evaluating the Renaissance Benchmarking Suite: Variety, Performance, and Complexity

An overview of the experimental setup that was used to assess the variety and complexity of the Renaissance suite, as well as its amenability to new compiler optimizations, is given and the obtained measurements are presented.

References

Quality and speed in linear-scan register allocation

This paper implements both register allocators within the Machine SUIF extension of the Stanford SUIF compiler system and describes improvements to the linear-scan approach that allow it to produce code of a quality near to that produced by graph coloring.

Trace register allocation

Trace Register Allocation is proposed, a register allocation approach that is tailored for just-in-time (JIT) compilation in the context of virtual machines with run-time feedback to offload costly operations such as spilling and splitting to less frequently executed branches and to focus on efficient register allocation for the hot parts of a program.

Optimized interval splitting in a linear scan register allocator

An optimized implementation of the linear scan register allocation algorithm for Sun Microsystems' Java HotSpot™ client compiler is presented, with the high impact of the Intel SSE2 extensions on the speed of numeric Java applications.

Linear scan register allocation on SSA form

The linear scan register allocator of the Java HotSpot client compiler is modified so that it operates on SSA form, and the simpler and faster version generates equally good or slightly better machine code.

Global Register Allocation Based on Graph Fusion

This paper presents a new coloring-based global register allocation algorithm that addresses all three issues in an integrated way: the algorithm starts with an interference graph for each region of the program, where a region can be a basic block, a loop nest, a superblock, a trace, or another combination of basic blocks.

A global progressive register allocator

An expressive model of global register allocation based on multicommodity network flows that explicitly represents spill code optimization, register preferences, copy insertion, and constant rematerialization, and a more elaborate progressive allocator that uses Lagrangian relaxation to compute the optimality of its allocations.

Register Allocation via Hierarchical Graph Coloring

It is shown that when register pressure is high, Callahan and Koblenz’s method generates worse code than Briggs’ method, the generally accepted method of graph coloring register allocation.

Register allocation by puzzle solving

We show that register allocation can be viewed as solving a collection of puzzles. We model the register file as a puzzle board and the program variables as puzzle pieces; pre-coloring and register

Linear scan register allocation

A new algorithm for fast global register allocation called linear scan, which allocates registers to variables in a single linear-time scan of the variables' live ranges, is described.
...