Much emphasis is now placed on chip-multiprocessor (CMP) architectures for exploiting thread-level parallelism in an application. In such architectures, speculation may be employed to execute applications that cannot be parallelized statically. In this paper, we present an efficient CMP architecture for speculative execution of sequential binaries without...
This paper presents the Alpha EV8 conditional branch predictor. The Alpha EV8 microprocessor project, canceled in June 2001 in a late phase of development, envisioned an aggressive 8-wide issue out-of-order superscalar microarchitecture featuring a very deep pipeline and simultaneous multithreading. Performance of such a processor is highly dependent on the...
Chip-multiprocessors (CMPs) are a promising approach for exploiting the increasing transistor count on a chip. To allow sequential applications to be executed on this architecture, current proposals incorporate hardware support to exploit speculative parallelism. However, these proposals either require re-compilation of the source program or use substantial...
With processor and memory technologies pushing the performance limit, the bottleneck is clearly shifting towards the system interconnect. Any solution that addresses PCI's bus-based interconnect, which has serious scalability problems, must also protect the huge legacy infrastructure. PCI Express provides such an evolutionary approach and allows a smooth...
Chip-multiprocessor (CMP) architectures are a promising design alternative to exploit the ever-increasing number of transistors that can be put on a die. To deliver high performance on applications that cannot be easily parallelized, CMPs can use additional support for speculatively executing the possibly data-dependent threads of an application. While some...
Dolphin's DX interconnect, comprising PCI Express-based hardware and accompanying software, is an industry-first solution for seamlessly integrating IO and clustering capabilities onto an enhanced PCI Express interconnect. This one-of-a-kind solution eliminates the need for two distinct interconnects: one dedicated to IO and the other to clustering.
In this paper, we exploit the existence of distant parallelism that future compilers could detect, and we characterise its performance under simultaneous multithreading architectures. By distant parallelism we mean parallelism that cannot be captured by the processor instruction window and that can produce threads suitable for parallel execution in a...