Vaibhav Sundriyal

Energy efficiency and energy-proportional computing have become a central focus in modern supercomputers. Many previous energy-saving strategies have focused solely on the CPU, while the DRAM subsystem has not been addressed sufficiently, even though memory consumes about 20% of the total power in a typical server platform. This paper describes a novel …
Although high-performance computing has always been about efficient application execution, both energy and power consumption have become critical concerns owing to their effect on operating costs and failure rates of large-scale computing platforms. Modern processors provide techniques, such as dynamic voltage and frequency scaling (DVFS) and CPU clock …
As the peak performance of modern computing platforms increases, their energy consumption grows as well, which may lead to overwhelming operating costs and failure rates. Techniques such as dynamic voltage and frequency scaling (DVFS) and CPU clock modulation (throttling) are often used to reduce the power consumption of the compute …
The drive to extract peak performance from modern computing platforms has led to a drastic increase in their energy and power consumption, thereby affecting operating costs and failure rates. Modern processors provide techniques, such as dynamic voltage and frequency scaling (DVFS) and CPU clock modulation (called throttling), to improve energy …
Modern high-performance computing system design is becoming increasingly aware of energy-proportional computing as a way to lower operational costs and raise reliability. At the same time, high-performance application developers are taking proactive steps toward lower energy consumption without a significant performance loss. One way to accomplish this is …
Accelerators are adopted to increase performance, reduce time-to-solution, and minimize energy-to-solution. However, employing them efficiently, given system and application characteristics, is often a daunting task. A goal of this work is to propose a general model that predicts performance and power requirements for an application, computational portions …
A heterogeneous cluster architecture is complex. It contains hundreds or thousands of devices connected by a tiered communication system in order to solve a problem. Because the system is heterogeneous, these devices have varying performance capabilities. To better understand the interactions that occur between the various devices during execution, an …