It has long been empirically observed that the cache miss rate decreased as a power law of cache size, where the power was approximately -1/2. In this paper, we examine the dependence of the cache miss rate on cache size both theoretically and through simulation. By combining the observed time dependence of the cache reference pattern with a statistical …
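The power-law observation in the abstract above can be illustrated with a small calculation. This is a minimal sketch, not the paper's model: the function name `scaled_miss_rate` and the sample numbers are hypothetical, and only the exponent of about -1/2 comes from the abstract.

```python
def scaled_miss_rate(base_miss_rate, base_size, new_size, exponent=-0.5):
    """Estimate the miss rate at new_size from a known (size, miss rate) point.

    Assumes miss_rate is proportional to size ** exponent, with the
    exponent of roughly -1/2 quoted in the abstract. The reference point
    passed in by the caller is illustrative, not measured data.
    """
    if base_size <= 0 or new_size <= 0:
        raise ValueError("cache sizes must be positive")
    return base_miss_rate * (new_size / base_size) ** exponent


if __name__ == "__main__":
    # Hypothetical reference point: a 10% miss rate at 32 KB.
    for size_kb in (32, 64, 128, 256):
        print(size_kb, scaled_miss_rate(0.10, 32, size_kb))
```

Under this assumption, quadrupling the cache size halves the miss rate, while doubling it reduces the miss rate by a factor of about 1.41.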
Ideally, a pipeline processor can run at a rate that is limited by its slowest stage. Branches in the instruction stream disrupt the pipeline, and reduce processor performance to well below ideal. Since workloads contain a high percentage of taken branches, techniques are needed to reduce or eliminate this degradation. A Branch History Table (BHT) stores …
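The two claims in the abstract above (the slowest stage sets the ideal rate, and taken branches push performance below it) can be sketched with simple arithmetic. This is an illustrative first-order model under stated assumptions, not the paper's method: the function names, the stage delays, the taken-branch fraction, and the penalty are all hypothetical.

```python
def pipeline_cycle_time(stage_delays_ns):
    """Ideal cycle time of a pipeline: the delay of its slowest stage."""
    return max(stage_delays_ns)


def effective_cpi(taken_branch_fraction, taken_branch_penalty_cycles, base_cpi=1.0):
    """First-order cycles per instruction when each taken branch stalls the pipe.

    Assumes every taken branch costs a fixed number of bubble cycles and
    that there are no other stall sources (both are simplifications).
    """
    return base_cpi + taken_branch_fraction * taken_branch_penalty_cycles


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    cycle = pipeline_cycle_time([1.0, 2.5, 1.5])
    cpi = effective_cpi(taken_branch_fraction=0.2, taken_branch_penalty_cycles=3)
    print("cycle time (ns):", cycle)
    print("effective CPI:", cpi)
    print("average time per instruction (ns):", cycle * cpi)
```

In this model, removing the taken-branch penalty (for example, by predicting the branch correctly) returns the effective CPI to its base value.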
Despite generation upon generation of scaling, computer chips have until now remained essentially 2-dimensional. Improvements in on-chip wire delay and in the maximum number of I/O per chip have not been able to keep up with transistor performance growth; it has become steadily harder to hide the discrepancy. 3D chip technologies come in a number of …
During the concept phase and definition of next generation high-end processors, power and performance will need to be weighted appropriately to deliver competitive cost/performance. It is not enough to adopt a CPI-centric view alone in early-stage definition studies. One of the fundamental issues confronting the architect at this stage is the choice of …
New methodology for early-stage, microarchitecture-level power–performance analysis of microprocessors
The PowerTimer toolset has been developed for use in early-stage, microarchitecture-level power–performance analysis of microprocessors. The key component of the toolset is a parameterized set of energy functions that can be used in conjunction with any …
A microprocessor implementing IBM S/390 architecture operates in a 10 + 2 way system at frequencies up to 411 MHz (2.43 ns). The chip is fabricated in a 0.2-µm L_eff CMOS technology with five layers of metal and tungsten local interconnect. The chip size is 17.35 mm × 17.30 mm with about 7.8 million transistors. The power supply is 2.5 V and measured power …
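The figures quoted in the abstract above can be cross-checked with a short calculation. This is a minimal sketch; the helper names `cycle_time_ns` and `die_area_mm2` are hypothetical, and the inputs are the frequency and die dimensions stated in the abstract.

```python
def cycle_time_ns(frequency_mhz):
    """Convert a clock frequency in MHz to a cycle time in nanoseconds."""
    return 1000.0 / frequency_mhz


def die_area_mm2(width_mm, height_mm):
    """Area of a rectangular die in square millimetres."""
    return width_mm * height_mm


if __name__ == "__main__":
    # 411 MHz should correspond to the 2.43 ns cycle time quoted above.
    print(round(cycle_time_ns(411), 2))
    # 17.35 mm by 17.30 mm gives a die area of roughly 300 mm^2.
    print(round(die_area_mm2(17.35, 17.30), 1))
```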
In this paper we present mechanisms that improve the accuracy and performance of history-based branch prediction. By studying the characteristics of the decision structures present in high-level languages, two mechanisms are proposed that reduce the number of wrong predictions made by a branch target buffer (BTB). Execution-driven modeling is used to evaluate …