Wattch: a framework for architectural-level power analysis and optimizations

@article{Brooks2000WattchAF,
  title={Wattch: a framework for architectural-level power analysis and optimizations},
  author={David M. Brooks and Vivek Tiwari and Margaret Martonosi},
  journal={Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201)},
  year={2000},
  pages={83--94}
}
  • D. Brooks, V. Tiwari, M. Martonosi
  • Published 1 May 2000
  • Computer Science
  • Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201)
Power dissipation and thermal issues are increasingly significant in modern processors. As a result, it is crucial that power/performance tradeoffs be made more visible to chip architects and even compiler writers, in addition to circuit designers. Most existing power analysis tools achieve high accuracy by calculating power estimates for designs only after layout or floorplanning are complete. In addition to being available only late in the design process, such tools are often quite slow… 
Processor power estimation techniques: a survey
TLDR
This paper provides a survey of most of the processor-specific power estimation techniques proposed after the mid-nineties and broadly focuses on estimating power using system-level models, architectural simulation, hardware performance counters, on-chip temperature profiles, and program execution profiles.
A hardware/software co-design architecture for thermal, power, and reliability management in chip multiprocessors
TLDR
A low-overhead and scalable hardware-based program phase classification scheme, termed Instruction Type Vectors (ITV), is proposed; it captures the execution frequencies of committed instruction types over profiling intervals and subsequently classifies and detects phases within threads.
Microarchitectural Level Power Analysis And Optimization In Single Chip Parallel Computers
TLDR
This thesis proposes a microarchitectural-level power estimation and analysis infrastructure for Single Chip Parallel Computers and focuses on the development of power estimation models, construction of the power analysis tool, study of the power advantages of the architecture, and identification of subsystems requiring power optimization.
An Energy-Aware Architectural Exploration Tool for ARM-Based SOCs
TLDR
An energy-aware architectural design exploration and analysis tool for ARM-based system-on-chip designs that integrates the behavior and energy models of several user-defined, custom processing units as an extension to the cycle-accurate instruction-level simulator for the ARM low-power processor family, called the ARMulator.
The XTREM power and performance simulator for the Intel XScale core: Design and experiences
TLDR
In building XTREM, the goals were to develop a microarchitecture simulator that, while still offering size parameterizations for cache and other structures, more accurately reflected a realistic processor pipeline.
A Systematic Methodology for Reliability Improvements on SoC-Based Software Defined Radio Systems
TLDR
This paper proposes a thermal-aware exploration framework that targets the elimination of temperature hotspots through efficient exploration of multiple microarchitecture selections over the temperature-area trade-off curve. Compared to the initial LEON3 instance, the methodology leads to an architecture that exhibits a temperature reduction, which improves resistance to aging phenomena by about 14%, with a controllable silicon-area overhead of about 15%.
A presentation and low-level energy usage analysis of two low-power architectural techniques
TLDR
This dissertation analyzes two architectural techniques that are designed to reduce the energy usage required to complete computational tasks, without impacting performance, and presents an analysis of physical models of a traditionally-architected 5-stage OpenRISC processor, an implementation of the TH-IC, and a statically pipelined processor.
Design space exploration for multicore architectures: a power/performance/thermal view
TLDR
A thorough evaluation of multicore architectures, examining the design space related to the number of cores, L2 cache size and processor complexity, and showing the behavior of the different configurations/applications with respect to performance, energy consumption and temperature.
A Multi-Granularity Power Modeling Methodology for Embedded Processors
TLDR
This paper proposes a unified processor power modeling methodology for the creation of power models at multiple granularity levels that can be quickly mapped to an ESL design flow and demonstrates the usefulness of having multiple power models.
Power reduction through work reuse [superscalar processor microarchitecture]
  • E. Talpes, Diana Marculescu
  • Computer Science
    ISLPED'01: Proceedings of the 2001 International Symposium on Low Power Electronics and Design (IEEE Cat. No.01TH8581)
  • 2001
TLDR
By modifying the well-established out-of-order, superscalar processor architecture, significant gains can be achieved in terms of power requirements without performance penalty, and experimental results show up to 52% savings in average energy per committed instruction for two different pipeline structures.

References

SHOWING 1-10 OF 34 REFERENCES
Reducing power in high-performance microprocessors
TLDR
The main trends that are driving the increased focus on design for low power are described and areas that need increased research focus in the future are also pointed out.
Energy dissipation in general purpose microprocessors
TLDR
It is found that careful design reduced the energy dissipation by almost 25%. Methods of reducing energy consumption that do not lead to performance loss, and methods to reduce delay by exploiting instruction-level parallelism, are explored.
Power and performance tradeoffs using various caching strategies
  • R. I. Bahar, G. Albera, S. Manne
  • Computer Science
    Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379)
  • 1998
TLDR
It is shown that, by using buffers, energy consumption of the memory subsystem may be reduced by as much as 13% for certain data cache configurations and by as much as 23% for certain instruction cache configurations without adversely affecting processor performance or on-chip energy consumption.
The filter cache: an energy efficient memory structure
TLDR
Experimental results across a wide range of embedded applications show that the filter cache results in improved memory system energy efficiency, and this work proposes to trade performance for power consumption by filtering cache references through an unusually small L1 cache.
Thermal management system for high performance PowerPC™ microprocessors
TLDR
The next-generation PowerPC™ microprocessor includes a thermal assist unit (TAU) comprised of an on-chip thermal sensor and associated logic and dynamically adjusts processor operation to provide maximum performance under changing environmental conditions.
Complexity-Effective Superscalar Processors
TLDR
A microarchitecture that simplifies wakeup and selection logic is proposed and discussed, which will help minimize performance degradation due to slow bypasses in future wide-issue machines.
Dynamically exploiting narrow width operands to improve processor power and performance
  • D. Brooks, M. Martonosi
  • Computer Science
    Proceedings Fifth International Symposium on High-Performance Computer Architecture
  • 1999
TLDR
This work proposes hardware mechanisms that dynamically recognize and capitalize on "narrow-bitwidth" instances and reduces processor power consumption by using aggressive clock gating to turn off portions of integer arithmetic units that will be unnecessary for narrow bitwidth operations.
Cache designs for energy efficiency
  • Ching-Long Su, A. Despain
  • Computer Science
    Proceedings of the Twenty-Eighth Annual Hawaii International Conference on System Sciences
  • 1995
TLDR
Experimental results suggest that both the block buffering and Gray code addressing techniques are ideal for instruction cache designs which tend to be accessed in a consecutive sequence and can achieve an order of magnitude energy reduction on caches.
The energy complexity of register files
  • V. Zyuban, P. Kogge
  • Computer Science
    Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379)
  • 1998
TLDR
It appears that none of these techniques will be enough to prevent centralized register files from becoming the dominant power component of next-generation superscalar computers, and alternative methods for inter-instruction communication need to be developed.
Pipeline gating: speculation control for energy reduction
TLDR
This paper introduces a hardware mechanism called pipeline gating to control rampant speculation in the pipeline, and presents inexpensive mechanisms for determining when a branch is likely to mispredict, and for stopping wrong-path instructions from entering the pipeline.