A compiler for VLIW and superscalar processors must expose sufficient instruction-level parallelism (ILP) to effectively utilize the parallel hardware. However, ILP within basic blocks is extremely limited for control-intensive programs. We have developed a set of techniques for exploiting ILP across basic block boundaries. These techniques are based on a ...
Code optimization and scheduling for superscalar and superpipelined processors often increase the register requirement of programs. For existing instruction sets with a small to moderate number of registers, this increased register requirement can be a factor that limits the effectiveness of the compiler. In this paper, we introduce a new architectural method ...
By exploiting fine-grain parallelism, superscalar processors can potentially increase the performance of future supercomputers. However, supercomputers typically have a long access delay to their first-level memory, which can severely restrict the performance of superscalar processors. Compilers attempt to move load instructions far enough ahead to hide this ...
Compilers for superscalar and VLIW processors must expose sufficient instruction-level parallelism in order to achieve high performance. Compile-time code transformations which expose instruction-level parallelism typically take into account the constraints imposed by all execution scenarios in the program. However, there are additional opportunities to ...
Moderate size register files can limit the performance of loop unrolling on multiple issue processors. With current scheduling heuristics, a breadth-first scheduling of iterations occurs, increasing register pressure and generating excessive spill code. A heuristic is proposed that causes a more depth-first scheduling of unrolled iterations. This heuristic ...