Learn More
A large number of MPI implementations are currently available , each of which emphasize different aspects of high-performance computing or are intended to solve a specific research problem. The result is a myriad of incompatible MPI implementations, all of which require separate installation, and the combination of which present significant logistical(More)
As high-performance clusters continue to grow in size and popularity, issues of fault tolerance and reliability are becoming limiting factors on application scalability. To address these issues, we present the design and implementation of a system for providing coordinated checkpointing and rollback recovery for MPI-based parallel applications. Our approach(More)
As supercomputers grow, understanding their behavior and performance has become increasingly challenging. New hurdles in scalability, programmability, power consumption, reliability, cost, and cooling are emerging, along with new technologies such as 3D integration, GP-GPUs, silicon-photonics, and other "game changers". Currently, they HPC community lacks a(More)
The growth in the number of generally available, distributed , heterogeneous computing systems places increasing importance on the development of user-friendly tools that enable application developers to efficiently use these resources. Open MPI provides support for several aspects of heterogeneity within a single, open-source MPI implementation. Through(More)
Open MPI was initially designed to support a wide variety of high-performance networks and network programming interfaces. Recently , Open MPI was enhanced to support networks that have full support for MPI matching semantics. Previous Open MPI efforts focused on networks that require the MPI library to manage message matching, which is sub-optimal for some(More)
Component architectures provide a useful framework for developing an extensible and maintainable code base upon which large-scale software projects can be built. Component methodologies have only recently been incorporated into applications by the High Performance Computing community, in part because of the perception that component architectures(More)
Power and energy concerns are motivating chip manufacturers to consider future hybrid-core processor designs that combine a small number of traditional cores optimized for single-thread performance with a large number of simpler cores optimized for throughput performance. This trend is likely to impact the way compute resources for network protocol(More)