Ruth Klundt

In this paper, we present a tool to extract I/O traces from very large applications running at full scale during their production runs. We analyze these traces to gain information about the application. We analyze the traces of three applications. The analysis showed that the I/O traces reveal much information about the application even without access to ...
This paper summarizes an IO performance analysis effort performed on Sandia National Laboratories' Red Storm platform. Our goal was to examine the IO system performance and identify problems or bottlenecks in any aspect of the IO subsystem. Our process examined the entire IO path from application to disk, both in segments and as a whole. Our final ...
High-performance computing (HPC) storage systems rely on access coordination to ensure that concurrent updates do not produce incoherent results. HPC storage systems typically employ pessimistic distributed locking to provide this functionality in cases where applications cannot perform their own coordination. This approach, however, introduces significant ...
This paper describes the implementation of a software architecture that was described in “An Extensible, Portable, Scalable Cluster Management Software Architecture” [4]. This implementation, named Cluster Integration Toolkit (CIToolkit, or just CIT), has been used to successfully integrate and support numerous cluster systems in production at Sandia ...
This paper describes an object-oriented software architecture for cluster integration and management that enables extensibility, portability, and scalability. This architecture has been successfully implemented and deployed on several large-scale production clusters at Sandia National Laboratories, the largest of which is currently 1861 nodes. This paper ...