Energy dissipation in general purpose microprocessors

  • Ricardo Gonzalez, Mark Horowitz
  • Published 1 September 1996
  • Computer Science, Physics
  • IEEE Journal of Solid-state Circuits
In this paper we investigate possible ways to improve the energy efficiency of a general purpose microprocessor. We show that the energy of a processor depends on its performance, so we chose the energy-delay product to compare different processors. To improve the energy-delay product we explore methods of reducing energy consumption that do not lead to performance loss (i.e. wasted energy), and explore methods to reduce delay by exploiting instruction level parallelism. We found that careful… 
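The energy-delay product named in the abstract is commonly computed as the energy needed to complete a task multiplied by the time taken to complete it. The following is a minimal sketch under that assumption. The `Design` class, the `better_edp` helper, and all numeric values are hypothetical illustrations, not figures or code from the paper. It shows why a design that only lowers energy can still score worse on this metric if it slows execution too much.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    """Hypothetical processor design point for one fixed benchmark."""
    name: str
    energy_joules: float   # energy to run the benchmark (made-up value)
    delay_seconds: float   # time to run the same benchmark (made-up value)

    @property
    def energy_delay_product(self) -> float:
        # Assumed definition: energy per task times execution time.
        return self.energy_joules * self.delay_seconds


def better_edp(a: Design, b: Design) -> Design:
    """Return the design with the lower (better) energy-delay product."""
    return a if a.energy_delay_product <= b.energy_delay_product else b


# Two illustrative design points: the second saves energy but runs slower.
baseline = Design("baseline", energy_joules=2.0, delay_seconds=1.0)
slow_low_energy = Design("slower, lower energy", energy_joules=1.5, delay_seconds=1.5)

if __name__ == "__main__":
    for d in (baseline, slow_low_energy):
        print(f"{d.name}: EDP = {d.energy_delay_product} J*s")
    print("preferred:", better_edp(baseline, slow_low_energy).name)
```

With these made-up numbers the slower design uses 25% less energy, yet its energy-delay product is higher, so the metric prefers the baseline.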

Memory Redundancy Elimination to Improve Application Energy Efficiency
Energy-estimation tools are used to profile the execution of benchmark applications and show that memory redundancy elimination can significantly reduce energy in the processor clocking network and the instruction and data caches.
On the energy-efficiency of speculative hardware
A simple method to accurately compare the energy efficiency of speculative architectures is introduced. It is based on runtime analysis of the entire processor chip and thus captures the energy consumption due to both the positive and the negative activities that arise from speculation.
Reducing Power Consumption of the Issue Logic
A technique to reduce the power consumption of the issue logic of superscalar processors is presented, and a technique to dynamically resize the instruction queue based on the existing parallelism in different periods of the execution is proposed.
Using dynamic cache management techniques to reduce energy in general purpose processors
This paper proposes, implements, and evaluates five techniques for dynamic analysis of the program instruction access behavior, which are then used to proactively guide the access of the L0-Cache, a mini cache located between the I-Cache and the CPU core.
Power analysis and instruction scheduling for reduced di/dt in the execution core of high-performanc
A novel approach to instruction scheduling based on the concept of schedule slack, which builds energy-efficient schedules by limiting the energy dissipated in a single cycle, is proposed, resulting in a decrease in the execution core's average peak power dissipation.
Microprocessor energy characterization and optimization through fast, accurate, and flexible simulation
The design of a fast, accurate, and flexible circuit simulation tool is described which enables transition-sensitive studies of microprocessor energy consumption that would otherwise be impossible or impractical and serves as a basis and motivation for further energy optimizations.
Computer Architecture Techniques for Power-Efficiency
This book aims to document some of the most important architectural techniques that were invented, proposed, and applied to reduce both dynamic power and static power dissipation in processors and memory hierarchies by focusing on their common characteristics.
Design and Evaluation of an Energy-Saving Real-Time Microprocessor
  • U. Brinkschulte, D. Lohn, Michael Bauer
  • Computer Science, Engineering
    2014 IEEE 17th International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing
  • 2014
This paper presents an architecture of a microprocessor core enhanced with a closed control loop at the hardware level that is able to deliver a target performance, thus holding execution time bounds.
Energy-Exposed Instruction Set Architectures
This short paper reports on ongoing work in the SCALE (Software-Controlled Architectures for Low Energy) project at MIT, where new energy-exposed hardware-software interfaces that give software fine-grain control over energy consumption are developed.
A Fine-Grained Runtime Power/Performance Optimization Method for Processors with Adaptive Pipeline Depth
Using this method to determine the proper PSU configurations during the program execution, the proposed method is able to achieve an averaged 13.5% energy-delay-product (EDP) reduction for SPEC CPU2000 integer benchmarks, compared to the baseline processor.


A 0.6 μm BiCMOS processor with dynamic execution
  • R. Colwell, R. Steck
  • Computer Science
    Proceedings ISSCC '95 - International Solid-State Circuits Conference
  • 1995
A next-generation, Intel-Architecture compatible microprocessor with dynamic execution is implemented in 0.60 μm 4-layer metal BiCMOS to allow complete access to all structures without the overhead of a full LSSD implementation.
A 64-b microprocessor with multimedia support
A strict design methodology allowed fully functional first silicon that met all speed targets; high clock speed was obtained by the use of delayed reset logic, a new register file design, and novel comparators.
Low-power digital design
Recently there has been a surge of interest in low-power devices and design techniques. While many papers have been published describing power-saving techniques for use in digital systems, trade-offs
A 32b 66 MHz 1.8 W microprocessor
A high-performance 32b CMOS microprocessor with an on-chip cache and low power for functional and standby modes has performance of 26MIPS and dissipates only 1.77 W in functional mode. The features
A 300-MHz 64-b quad-issue CMOS RISC microprocessor
This 300 MHz quad-issue custom VLSI implementation of the Alpha architecture delivers 1200 MIPS, 600 MFLOPS, 341 SPECint92, and 512 SPECfp92 and is packaged in a 499-pin ceramic IPGA.
Support for Speculative Execution in High-Performance Processors
Boosting is defined, an architectural mechanism for speculative execution that allows us to uncover the instruction-level parallelism across conditional branches without adversely affecting the instruction count of the application or the cycle time of the processor.
A 1 watt 68040-compatible microprocessor
A 68040-compatible microprocessor, optimized for portable applications, is presented and a new metric is defined to compare the relative merit of quiescent states in microprocessors.
Available instruction-level parallelism for superscalar and superpipelined machines
A parameterizable code reorganization and simulation system was developed and used to measure instruction-level parallelism, and the average degree of superpipelining metric is introduced, suggesting that this metric is already high for many machines.
Limits on multiple instruction issue
This paper investigates the limitations on designing a processor which can sustain an execution rate of greater than one instruction per cycle on highly-optimized, non-scientific applications and determines that these applications contain enough instruction independence to sustain an instruction rate of about two instructions per cycle.
Low-power CMOS digital design
An architecturally based scaling strategy is presented which indicates that the optimum voltage is much lower than that determined by other scaling considerations, and is achieved by trading increased silicon area for reduced power consumption.