Architecting an Energy-Efficient DRAM System for GPUs
Low energy consumption is becoming the primary design consideration for battery-operated and portable embedded systems such as personal digital assistants, digital still and movie cameras, digital music players, and medical devices. For a typical processor-based system, the energy consumption of the processor-memory component is split roughly 40–60% between the processor and memory. In this paper, we study how reordering memory bus traffic affects bus switching activity and power consumption. To conduct this study, we developed a software tool, called MPOWER, that lets an embedded system designer collect a trace of memory bus accesses and determine the switching activity of that trace for given design parameters such as data and address bus width, bus multiplexing, cache size, and block size. Using MPOWER, we measured how effectively reordering memory accesses reduces switching activity. We found that for small caches, which are typical of embedded processors, the number of signal transitions can be reduced by an average of 53% in the ideal case. This paper also describes a practical hardware scheme for reordering the elements within a cache line to reduce switching activity. We found that cache line reordering reduces switching activity by 15–31%.
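
To make the switching-activity metric concrete, the following is a minimal illustrative sketch, not MPOWER itself and not the hardware scheme described in the paper. It assumes that switching activity is counted as the number of bus lines that toggle between consecutive words (the Hamming distance), and that the "ideal" ordering of a short cache line can be found by exhaustive search. The function names `transitions`, `switching_activity`, and `best_order` are hypothetical.

```python
from itertools import permutations


def transitions(a: int, b: int, width: int = 32) -> int:
    """Count the bus lines that toggle when word b follows word a."""
    mask = (1 << width) - 1
    return bin((a ^ b) & mask).count("1")


def switching_activity(trace, width: int = 32, initial: int = 0) -> int:
    """Sum the bit transitions over a trace, starting from the bus state `initial`."""
    total = 0
    prev = initial
    for word in trace:
        total += transitions(prev, word, width)
        prev = word
    return total


def best_order(line, width: int = 32, initial: int = 0):
    """Exhaustively search the orderings of one cache line (assumption: the
    line is short, since the search cost grows factorially with its length)."""
    return min(
        permutations(line),
        key=lambda order: switching_activity(order, width, initial),
    )


if __name__ == "__main__":
    line = [0x00, 0xFF, 0x01, 0xFE]          # four 8-bit words in fetch order
    original = switching_activity(line, width=8)
    reordered = switching_activity(best_order(line, width=8), width=8)
    print(original, reordered)
```

For this hypothetical four-word line on an 8-bit bus, the fetch order costs 23 transitions while the best ordering costs 9. Exhaustive search is used only to illustrate the ideal case; a practical scheme would need a much cheaper heuristic.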