Learn More
The Trimaran compiler research infrastructure for instruction level parallelism. We have undertaken comprehensive research work on instruction level parallelism (ILP), including pipelining and placing data/instructions into the appropriate level of SPM to achieve the best (More)
This paper presents a novel hardware-based approach for identifying, profiling, and monitoring hot spots in order to support runtime optimization of general-purpose programs. The proposed approach consists of a set of tightly coupled hardware tables and control logic modules that are placed in the retirement stage of a processor pipeline, removed from the (More)
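A rough software analogue of the monitoring idea sketched above (an illustration only, not the paper's hardware design): keep a small table of branch addresses with retirement counters and flag an address as part of a hot spot once its share of all retired branches crosses a threshold.

    /* Hypothetical software analogue of hardware hot-spot monitoring:
     * a direct-mapped table of branch addresses with execution counters. */
    #define TABLE_SIZE   1024
    #define HOT_PERCENT  5              /* threshold: share of all retired branches */

    struct counter { unsigned long addr; unsigned long count; };

    static struct counter table[TABLE_SIZE];
    static unsigned long total_branches;

    void record_branch(unsigned long addr)
    {
        struct counter *c = &table[addr % TABLE_SIZE];
        if (c->addr != addr) {          /* simple replacement on conflict */
            c->addr = addr;
            c->count = 0;
        }
        c->count++;
        total_branches++;
    }

    int in_hot_spot(unsigned long addr)
    {
        struct counter *c = &table[addr % TABLE_SIZE];
        return c->addr == addr &&
               c->count * 100 > total_branches * HOT_PERCENT;
    }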
To exploit instruction level parallelism, compilers for VLIW and superscalar processors often employ static code scheduling. However, the available code reordering may be severely restricted due to ambiguous dependences between memory instructions. This paper introduces a simple hardware mechanism, referred to as the memory conflict buffer, (More)
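To make the ambiguity concrete, a minimal constructed example (not taken from the paper): the compiler cannot move the load of *b above the store to *a, because the two pointers may refer to the same location.

    /* Ambiguous memory dependence: a and b may alias, so a conventional
     * scheduler must keep the load after the store. */
    int update(int *a, int *b)
    {
        *a = 42;        /* ambiguous store */
        return *b;      /* ambiguous load: cannot be hoisted safely */
    }

The memory conflict buffer lets the scheduler hoist such a load speculatively and fall back to compiler-generated repair code if the addresses turn out to conflict at run time.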
The Java bytecode language is emerging as a software distribution standard. With major vendors committed to porting the Java run-time environment to their platforms, programs in Java bytecode are expected to run without modification on multiple platforms. These first-generation run-time environments rely on an interpreter to bridge the gap between the (More)
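As a toy illustration of the interpreter model mentioned above (made-up opcodes, not the Java bytecode set), each virtual instruction costs a dispatch plus its actual work, which is the per-instruction overhead that just-in-time compilation later removes.

    /* Minimal stack-machine interpreter loop with hypothetical opcodes. */
    enum { OP_PUSH, OP_ADD, OP_HALT };

    int interpret(const int *code)
    {
        int stack[64];
        int sp = 0, pc = 0;

        for (;;) {
            switch (code[pc++]) {                   /* fetch and dispatch */
            case OP_PUSH: stack[sp++] = code[pc++]; break;
            case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
            case OP_HALT: return stack[sp - 1];
            }
        }
    }

For example, interpret((int[]){ OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT }) evaluates 2 + 3 while paying four dispatches for one useful addition.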
Compiler-controlled speculative execution has been shown to be effective in increasing the available instruction level parallelism (ILP) found in non-numeric programs. An important problem associated with compiler-controlled speculative execution is to accurately report and handle exceptions caused by speculatively executed instructions. Previous solutions (More)
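A small constructed example (not from the paper) shows why speculation complicates exception reporting: hoisting a load above its guarding branch can make it fault on inputs where the original program never dereferences the pointer.

    /* Original, safe code: the load is guarded by the branch. */
    int guarded(int *p)
    {
        if (p != NULL)
            return *p;
        return 0;
    }

    /* After compiler-controlled speculation (conceptual): the load is hoisted
     * above the branch to expose ILP, so it may now execute with p == NULL.
     * Its exception must be suppressed or deferred, and reported only if the
     * guarded path is actually taken. */
    int speculated(int *p)
    {
        int t = *p;            /* speculative load: may fault spuriously */
        if (p != NULL)
            return t;
        return 0;
    }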
Wide-issue processors continue to achieve higher performance by exploiting greater instruction-level parallelism. Dynamic techniques such as out-of-order execution and hardware speculation have proven effective at increasing instruction throughput. Run-time optimization promises to provide an even higher level of performance by adaptively applying (More)
This paper introduces a new architectural approach that supports compiler-synthesized dynamic branch prediction. In compiler-synthesized dynamic branch prediction, the compiler generates code sequences that, when executed, digest relevant state information and execution statistics into a condition bit, or predicate. The hardware then utilizes this (More)
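As a hedged sketch of the idea in plain C (the paper's actual code sequences and architectural support are not reproduced here), the compiler can emit instructions that fold recent execution history into a condition bit consulted by a later branch.

    /* Hypothetical illustration: software maintains a one-bit "last outcome"
     * predicate for a data-dependent branch and measures how often that
     * predicate would have steered the branch correctly. */
    int count_correct_predictions(const int *key, int n)
    {
        int predict_taken = 0;              /* compiler-allocated condition bit */
        int correct = 0;

        for (int i = 0; i < n; i++) {
            int taken = (key[i] > 100);     /* hard-to-predict condition */
            if (taken == predict_taken)
                correct++;                  /* the synthesized predicate was right */
            predict_taken = taken;          /* digest the outcome into the predicate */
        }
        return correct;
    }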
A machine description facility allows compiler writers to specify machine execution constraints to the optimization and scheduling phases of an instruction-level parallelism (ILP) optimizing compiler. The machine description (MDES) facility should support quick development and easy maintenance of machine execution constraint descriptions by compiler (More)
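Purely as an illustration of what a machine execution constraint might capture (a generic sketch, not the MDES/HMDES syntax), such a description ties each opcode to the functional unit it occupies and the latency of its result.

    /* Hypothetical, simplified constraint table: which functional unit an
     * operation needs and how many cycles until its result is available. */
    enum unit { UNIT_IALU, UNIT_FALU, UNIT_MEM, UNIT_BRANCH };

    struct op_constraint {
        const char *opcode;
        enum unit   unit;       /* resource the operation occupies    */
        int         latency;    /* cycles before dependents may issue */
    };

    static const struct op_constraint constraints[] = {
        { "add",  UNIT_IALU,   1 },
        { "mul",  UNIT_IALU,   3 },
        { "load", UNIT_MEM,    2 },
        { "fadd", UNIT_FALU,   2 },
        { "br",   UNIT_BRANCH, 1 },
    };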
Achieving efficient and correct synchronization of multiple threads is a difficult and error-prone task at small scale and, as we march towards extreme-scale computing, will be even more challenging when the resulting application is supposed to utilize millions of cores efficiently. Transactional Memory (TM) is a promising technique to ease the burden on (More)
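For a feel of the retry semantics that TM provides (a sketch under the assumption of an optimistic, lock-free update to a single word; real TM systems cover arbitrary read and write sets), an update can be written as read, compute, and commit-if-unchanged.

    #include <stdatomic.h>

    /* Optimistic update of one shared word: commit only if no other thread
     * has modified it since it was read; otherwise retry with the fresh value. */
    void add_delta(atomic_int *shared, int delta)
    {
        int old = atomic_load(shared);
        while (!atomic_compare_exchange_weak(shared, &old, old + delta)) {
            /* 'old' was refreshed by the failed compare-and-swap; retry */
        }
    }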