The historical context surrounding the birth of the DARPA High Productivity Computing Systems (HPCS) program is important for understanding why federal government agencies launched this new, long-term high performance computing program and renewed their commitment to leadership computing in support of national security, large science, and space requirements...
The use of GPUs to accelerate the factoring of large sparse symmetric indefinite matrices shows the potential of yielding important benefits to a large group of widely used applications. This paper examines how a multifrontal sparse solver performs when exploiting both the GPU and its multi-core host. It demonstrates that the GPU can dramatically accelerate...
We present a scalable parallelization scheme for high-order stencil computations that also optimizes memory behavior on multicore clusters. Our multilevel approach combines: (i) inter-node parallelization via spatial decomposition; (ii) inter-core parallelization via multithreading and explicit non-uniform memory access (NUMA) control; (iii) data locality...
In this paper, we describe a compilation system that automates much of the process of performance tuning that is currently done manually by application programmers interested in high performance. Due to the growing complexity of accurate performance prediction, our system incorporates empirical techniques to execute variants of code segments with...
In just one decade, the 1990s, supercomputer centers have undergone two fundamental transitions which require rethinking their operation and their role in high performance computing. The first transition in the early to mid-1990s resulted from a technology change in high performance computing architecture. Highly parallel distributed memory machines built...