Cloud resources promise to be an avenue to address new categories of scientific applications, including data-intensive science applications, on-demand/surge computing, and applications that require customized software environments. However, there is limited understanding of how to operate and use clouds for scientific applications. Magellan, a project...
Since clusters were first introduced [5], node counts have increased rapidly. Currently, a variety of clusters with more than one thousand nodes are listed on the TOP500 list. In the next three years, clusters with more than four thousand nodes are expected. Cluster management functionality has lagged behind all other areas of system software. In order to...
We describe the use of component architecture in an area to which this approach has not been classically applied, the area of cluster system software. By "cluster system software," we mean the collection of programs used in configuring and maintaining individual nodes, together with the software involved in submission, scheduling, monitoring, and...
While previous work has shown MPI to provide capabilities for system software, actual adoption has not widely occurred. We discuss process management shortcomings in MPI implementations and their impact on MPI usability for system software and management tasks. We introduce MPISH, a parallel shell designed to address these issues.
Petascale HPC systems are among the largest systems in the world. Intrepid, one such system, is a 40,000-node, 556-teraflop Blue Gene/P system that has been deployed at Argonne National Laboratory. In this paper, we provide some background about the system and our administration experiences. In particular, due to the scale of the system, we have faced a...
While configuration management systems are generally regarded as useful, their deployment process is not well understood or documented. In this paper, we present a case study in configuration management tool deployment. We describe the motivating factors and both the technical considerations and the social issues involved in this process. Our discussion...
With the recent trend of exploiting cloud resources, we have embarked on a journey to deploy an open source cloud using Eucalyptus. During the past year we have learned many lessons about the use of Eucalyptus and clouds in general. The area of security provides significant challenges in operating a cloud, the scalability supposedly inherent in...