Attaining performance in evaluating two-electron repulsion integrals and constructing the Fock matrix is of considerable importance to the computational chemistry community. Because of its numerical complexity, improving performance across a variety of leading supercomputing platforms is an increasing challenge due to the significant...
The largest supercomputers in the world today consist of hundreds of thousands of processing cores and many more other hardware components. At such scales, hardware faults are commonplace, necessitating fault-resilient software systems. While different fault-resilient models are available, most focus on allowing the computational processes to survive...
In the multicore era, it was possible to exploit the increase in on-chip parallelism by simply running multiple MPI processes per chip. Unfortunately, manycore processors' greatly increased thread- and data-level parallelism, coupled with a reduced memory capacity, demands an altogether different approach. In this paper we explore augmenting two NWChem modules,...
The Hartree-Fock (HF) method is the fundamental first step for incorporating quantum mechanics into many-electron simulations of atoms and molecules, and it is an important component of computational chemistry toolkits like NWChem. The GTFock code is an HF implementation that, while it does not have all the features in NWChem, represents crucial algorithmic...
Large scientific code bases are often composed of several layers of runtime libraries, implemented in multiple programming languages. In such situations, programmers often choose conservative synchronization patterns, leading to suboptimal performance. In this paper, we present context-sensitive dynamic optimizations that elide barriers redundant during the...
In this paper we present optimizations that use DVFS mechanisms to reduce the total energy usage of the NWChem computational chemistry code. The analyses handle dynamically load-balanced, well-optimized code in a manner independent of the runtime and programming model. Our main insight is that noise is intrinsic to large-scale executions and it appears whenever...
In this paper we present optimizations that use DVFS mechanisms to reduce the total energy usage in scientific applications. Our main insight is that noise is intrinsic to large-scale parallel executions and it appears whenever shared resources are contended. The presence of noise allows us to identify and manipulate any program regions amenable to DVFS...
Component failures in high-end systems are increasingly the norm rather than the exception. While application-transparent and application-aware approaches for checkpoint/restart have been proposed in the literature, they cease to scale beyond a few hundred nodes. Although SSDs/NVRAMs alleviate the checkpointing overhead, the cost of recovery from failures...