Leakage Current: Moore's Law Meets Static Power
Off-state leakage is static power, current that leaks through transistors even when they are turned off. The other source of power dissipation in today's microprocessors, dynamic power, arises from
Evaluating STT-RAM as an energy-efficient main memory alternative
TLDR
It is shown that an optimized, equal-capacity STT-RAM main memory can provide performance comparable to DRAM main memory, with an average 60% reduction in main memory energy.
Neither more nor less: Optimizing thread-level parallelism for GPGPUs
TLDR
This paper proposes a dynamic CTA scheduling mechanism, called DYNCTA, which modulates the TLP by allocating the optimal number of CTAs, based on application characteristics, to minimize resource contention.
Design and Management of 3D Chip Multiprocessors Using Network-in-Memory
TLDR
A router architecture and a topology design that make use of a network architecture embedded into the L2 cache memory are proposed. The results demonstrate that a 3D L2 memory architecture performs much better than conventional two-dimensional designs under different numbers of layers and vertical connections.
The design and use of simplePower: a cycle-accurate energy estimation tool
TLDR
This paper uses SimplePower to evaluate the impact of a new selective gated pipeline register optimization, a high-level data transformation, and a power-conscious post-compilation optimization on the datapath, memory, and on-chip bus energy, respectively.
DRPM: dynamic speed control for power management in server class disks
TLDR
A new approach to modulate disk speed (RPM) dynamically is presented, and a practical implementation to exploit this mechanism is given, showing that DRPM can provide significant energy savings without compromising much on performance.
Reducing memory interference in multicore systems via application-aware memory channel partitioning
Main memory is a major shared resource among cores in a multicore system. If the interference between different applications' memory requests is not controlled effectively, system performance can
OWL: cooperative thread array aware scheduling techniques for improving GPGPU performance
TLDR
This paper presents a coordinated CTA-aware scheduling policy that utilizes four schemes to minimize the impact of long memory latencies, and indicates that the proposed mechanism can provide a 33% average performance improvement compared to the commonly employed round-robin warp scheduling policy.
Dynamic management of scratch-pad memory space
TLDR
A compiler-controlled dynamic on-chip scratch-pad memory (SPM) management framework that uses both loop and data transformations is proposed. The results indicate significant reductions in data transfer activity between SPM and off-chip memory.
ICR: in-cache replication for enhancing data cache reliability
TLDR
This paper proposes a novel solution to this problem by allowing in-cache replication, wherein reliability can be enhanced without excessively slowing down cache accesses or requiring significant area cost increases.