A compiler for VLIW and superscalar processors must expose sufficient instruction-level parallelism (ILP) to effectively utilize the parallel hardware. However, ILP within basic blocks is extremely limited for control-intensive programs. We have developed a set of techniques for exploiting ILP across basic block boundaries. These techniques are based on a(More)
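As a rough illustration of the idea behind this entry (not the paper's own code), the sketch below shows why a branch limits ILP inside a single basic block and how if-conversion to a select merges the two arms into straight-line code the scheduler can overlap; the function names are hypothetical.

```c
/* Branchy form: each basic block holds only one or two dependent
 * operations, so there is little ILP inside any single block.        */
int sum_abs_diff(const int *a, const int *b, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++) {
        int d = a[i] - b[i];
        if (d < 0)            /* block boundary: the branch limits scheduling */
            s += -d;
        else
            s += d;
    }
    return s;
}

/* If-converted form: both arms become straight-line code guarded by a
 * select, so work from what used to be separate basic blocks (and from
 * adjacent iterations) can be scheduled together.                     */
int sum_abs_diff_ifconv(const int *a, const int *b, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++) {
        int d   = a[i] - b[i];
        int neg = -d;
        s += (d < 0) ? neg : d;   /* select instead of a branch */
    }
    return s;
}
```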
This paper explores Speculative Precomputation, a technique that uses idle thread contexts in a multithreaded architecture to improve the performance of single-threaded applications. It attacks program stalls from data cache misses by pre-computing future memory accesses in available thread contexts and prefetching these data. This technique is evaluated by(More)
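A minimal software sketch of the idea, assuming a pointer-chasing list whose node addresses miss in the cache: a helper thread runs only the address-generating slice of the loop ahead of the main thread and prefetches each node. An OS thread stands in here for the spare hardware thread context the paper uses; all names are illustrative, not the paper's infrastructure.

```c
#include <pthread.h>
#include <stddef.h>

struct node { struct node *next; long payload; };

/* Helper ("precomputation") thread: walks the list ahead of the main
 * thread and prefetches each upcoming node.                          */
static void *prefetch_slice(void *arg)
{
    for (struct node *p = arg; p != NULL; p = p->next)
        __builtin_prefetch(p->next, 0 /* read */, 1 /* low temporal locality */);
    return NULL;
}

long sum_list(struct node *head)
{
    pthread_t helper;
    pthread_create(&helper, NULL, prefetch_slice, head);

    long sum = 0;
    for (struct node *p = head; p != NULL; p = p->next)
        sum += p->payload;          /* main thread: the real computation */

    pthread_join(helper, NULL);
    return sum;
}
```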
Recently, a number of thread-based prefetching techniques have been proposed. These techniques aim to hide memory latency in single-threaded applications by leveraging multithreading resources to perform memory prefetching via speculative prefetch threads. Software-based speculative precomputation (SSP) is one such technique, proposed for multithreaded(More)
In this paper, we evaluate the benefits achievable from pointer analysis and other memory disambiguation techniques for C/C++ programs, using the framework of the production compiler for the Intel® Itanium™ processor. Most of the prior work on memory disambiguation has primarily focused on pointer analysis, and either presents only static(More)
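A small illustration of what memory disambiguation buys the compiler (my example, not the paper's benchmark code): when two pointers may alias, every load must stay ordered after the preceding store; once pointer analysis, or a `restrict` qualifier, proves independence, the loop can be unrolled, reordered, or vectorized freely.

```c
/* Without disambiguation, the compiler must assume dst[i] may alias
 * src[i+1], so each load has to wait for the preceding store.        */
void scale(float *dst, const float *src, float k, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = k * src[i];
}

/* With 'restrict' (or pointer analysis proving the same fact), the
 * loads and stores are known independent and can be scheduled freely. */
void scale_noalias(float *restrict dst, const float *restrict src,
                   float k, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = k * src[i];
}
```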
Much of the previous work on modulo scheduling has targeted numeric programs, in which, often, the majority of the loops are well-behaved loop-counter-based loops without early exits. In control-intensive non-numeric programs, the loops frequently have characteristics that make it more difficult to effectively apply modulo scheduling. These characteristics(More)
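The contrast the entry describes can be sketched as follows (illustrative functions, not taken from the paper): a counter-based loop whose trip count is known at entry is a natural modulo-scheduling candidate, while a loop with a data-dependent early exit, common in control-intensive code, is not.

```c
/* Well-behaved candidate: trip count known at loop entry, no early
 * exit, so a steady-state kernel with a fixed initiation interval
 * can be formed.                                                     */
float dot(const float *a, const float *b, int n)
{
    float s = 0.0f;
    for (int i = 0; i < n; i++)
        s += a[i] * b[i];
    return s;
}

/* Control-intensive shape: the exit depends on loaded data, so the
 * iteration count is unknown and any overlap of later iterations is
 * speculative. (Assumes the key is present in the table.)            */
int find_key(const int *table, int key)
{
    int i = 0;
    while (table[i] != key)   /* early, data-dependent exit */
        i++;
    return i;
}
```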
Advances in hardware technology have made it possible for microprocessors to execute a large number of instructions concurrently, i.e., in parallel. These microprocessors take advantage of the opportunity to execute instructions in parallel to increase the execution speed of a program. As in other forms of parallel processing, the performance of these(More)
Software pipelining is a compile-time scheduling technique that overlaps successive loop iterations to expose operation-level parallelism. An important problem with the development of effective software pipelining algorithms is how to handle loops with conditional branches. Conditional branches increase the complexity and decrease the effectiveness of(More)
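The kind of overlap software pipelining creates can be shown by hand on a simple loop (a sketch of the transformation's effect, not the scheduling algorithm itself; names are illustrative): the load for iteration i+1 is issued while iteration i is still being computed and stored.

```c
/* Original loop: load, multiply-add, and store are serialized within
 * each iteration.                                                    */
void saxpy(float *y, const float *x, float a, int n)
{
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Hand-pipelined sketch: the next iteration's load overlaps the current
 * iteration's compute and store, which is the overlap a software
 * pipeliner would generate automatically.                             */
void saxpy_pipelined(float *y, const float *x, float a, int n)
{
    if (n <= 0) return;
    float xi = x[0];                    /* prologue: first load         */
    for (int i = 0; i < n - 1; i++) {
        float xnext = x[i + 1];         /* load for the next iteration  */
        y[i] = a * xi + y[i];           /* compute + store for this one */
        xi = xnext;
    }
    y[n - 1] = a * xi + y[n - 1];       /* epilogue: last iteration     */
}
```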