Learn More
A compiler for VLIW and superscalar processors must expose sufficient instruction-level parallelism (ILP) to effectively utilize the parallel hardware. However, ILP within basic blocks is extremely limited for control-intensive programs. We have developed a set of techniques for exploiting ILP across basic block boundaries. These techniques are based on a(More)
<i>This paper explores Speculative Precomputation, a technique that uses idle thread context in a multithreaded architecture to improve performance of single-threaded applications. It attacks program stalls from data cache misses by pre-computing future memory accesses in available thread contexts, and prefetching these data. This technique is evaluated by(More)
Recently, a number of thread-based prefetching techniques have been proposed. These techniques aim at improving the latency of single-threaded applications by leveraging multithreading resources to perform memory prefetching via speculative prefetch threads. Software-based speculative precomputation (SSP) is one such technique, proposed for multithreaded(More)
Much of the previous work on modulo scheduling has targeted numeric programs, in which, often, the majority of the loops are well-behaved loop-counter-based loops without early exits. In control-intensive non-numeric programs, the loops frequently have characteristics that make it more difficult to effectively apply modulo scheduling. These characteristics(More)
In this paper, we evaluate the benefits achievable from pointer analysis and other memory disambiguation techniques for C/C++ programs, using the framework of the production compiler for the Intel&#174; Itanium#8482; processor. Most of the prior work on memory disambiguation has primarily focused on pointer analysis, and either presents only static(More)
Superscalar and superpipelined processors utilize parallelism to achieve peak performance that can be several times higher than that of conventional scalar processors. In order for this potential to be translated into the speedup of real programs, the compiler must be able to schedule instructions so that the parallel hardware is eeectively utilized.(More)