Sandeep Chandran

With an increasing number of cores per chip, it is becoming harder to guarantee optimal performance for parallel shared-memory applications due to interference caused by kernel threads, interrupts, bus contention, and temperature management schemes (collectively referred to as jitter). We demonstrate that the performance of parallel programs is reduced (up to 35.22 …
Barriers have long been recognized as important performance-critical constructs in parallel applications. As a consequence, researchers have proposed fast implementations of barriers in both traditional electrical networks and in non-conventional networks such as optical NoCs. We prove in this paper that current protocols for barriers in optical NoCs are …
In this report, we present some fundamental results and bounds on the number of messages and storage required to implement barriers using futuristic on-chip optical and RF networks. We prove that it is necessary to maintain a count to at least N (the number of threads) in memory, broadcast the barrier id at least once, and if we elect a coordinator, we can …
On-chip trace buffers are increasingly being used for at-speed debug during post-silicon validation. However, the activity history captured by these buffers is small due to their limited size. We propose a novel scheme that extends the captured trace history (by up to 162%) by using a portion of the trace buffer to also store summaries of trace messages. We …
The internal state of complex modern processors often needs to be dumped out frequently during post-silicon validation. Since the caches hold most of the state, the volume of data dumped and the transfer time are dominated by the large caches present in the architecture. The limited bandwidth to transfer data present in these large caches off-chip …
Image content clustering is an effective way to organize large databases, thereby making the content-based image retrieval process much easier. However, clustering images with varied backgrounds and foregrounds is quite challenging. In this paper, we propose a novel image content clustering paradigm suitable for clustering large and diverse image databases. …
Advancements in the internet and cloud computing have resulted in a huge amount of multimedia data, and processing this data has become more complex and computationally intensive. As a result, it has become very challenging for image retrieval algorithms to efficiently extract useful information from this data. Local Derivative Pattern (LDP) …
The internal state of complex modern processors often needs to be dumped out frequently during post-silicon validation. Since the last level cache (considered L2 in this paper) holds most of the state, the volume of data dumped and the transfer time are dominated by the L2 cache. The limited bandwidth to transfer data off-chip coupled with the large size of …
On-chip trace buffers are increasingly being used for at-speed debug during post-silicon validation. The limited size of these buffers causes them to overflow frequently. In scenarios where such overflow is not desirable, the chip is stalled, and the state data recorded in these buffers is transferred off-chip. Such frequent stalling significantly …