Rapid increases in computing and communication performance are exacerbating the long-standing problem of performance-limited input/output. Indeed, for many otherwise scalable parallel applications, input/output is emerging as a major performance bottleneck. The design of scalable input/output systems depends critically on the input/output requirements and ...
TeraGrid is a national-scale computational science facility supported through a partnership among thirteen institutions, with funding from the US National Science Foundation [1]. Initially created through a Major Research Equipment Facilities Construction (MREFC [2]) award in 2001, the TeraGrid facility began providing production computing, storage, ...
This paper describes the program execution framework being developed by the Grid Application Development Software (GrADS) Project. The goal of this framework is to provide good resource allocation for Grid applications and to support adaptive reallocation if performance degrades because of changes in the availability of Grid resources. At the heart of this ...
We present an auto-tuning system for optimizing I/O performance of HDF5 applications and demonstrate its value across platforms, applications, and at scale. The system uses a genetic algorithm to search a large space of tunable parameters and to identify effective settings at all layers of the parallel I/O stack. The parameter settings are applied ...
A large and important class of national challenge applications are irregular, with complex, data-dependent execution behavior, and dynamic, with time-varying resource demands. We believe the solution to the performance optimization conundrum is integration of dynamic performance instrumentation and on-the-fly performance data reduction with configurable, ...
I/O has become one of the determining factors of HPC application performance. Understanding an application's I/O activity requires a multi-level view of the I/O function flow that includes high-level I/O libraries. We have developed a tracing framework, called Recorder, that captures I/O function calls at multiple layers of the parallel I/O stack without ...
The modern parallel I/O stack consists of several software layers with complex inter-dependencies and performance characteristics. While each layer exposes tunable parameters, it is often unclear to users how different parameter settings interact with each other and affect overall I/O performance. As a result, users often resort to default system settings, ...