Aneesh Aggarwal

Evolving submicron technology makes decentralized designs particularly attractive. A common form of decentralization adopted in processors is to partition the execution core into multiple clusters. Each cluster has a small instruction window and a set of functional units. A number of algorithms have been proposed for distributing...
We present an extension of field analysis (see [4]) called related field analysis, which is a general technique for proving relationships between two or more fields of an object. We demonstrate the feasibility and applicability of related field analysis by applying it to the problem of removing array bounds checks. For array bounds check removal, we...
With shrinking feature sizes, increasing chip capacity, and increasing clock speeds, microprocessors are becoming increasingly susceptible to transient (soft) errors. Redundant multi-threading (RMT) is an attractive approach for concurrent error detection and recovery. However, redundant threads significantly increase the pressure on the processor resources,...
Software prefetching and locality optimizations are techniques for overcoming the speed gap between processor and memory. In this paper, we evaluate the impact of memory trends on the effectiveness of software prefetching and locality optimizations for three types of applications: regular scientific codes, irregular scientific codes, and pointer-chasing...
Software prefetching and locality optimizations are techniques for overcoming the speed gap between processor and memory. In this paper, we provide a comprehensive summary of current software prefetching and locality optimization techniques, and evaluate the impact of memory trends on the effectiveness of these techniques for three types of applications:...
Violations in memory references cause tremendous loss of productivity, catastrophic mission failures, loss of privacy and security, and much more. Software mechanisms to detect memory violations have high false positive and false negative rates or huge performance overhead. This paper proposes architectural support to detect memory reference violations in...
We propose a series of aggressive register deallocation mechanisms to reduce register file pressure and increase the parallelism exploited by superscalar microprocessors. Our techniques are based on the key observation that a register value can be temporarily decoupled from the register identifier. Specifically, even if a physical register is deallocated,...