Michael Oberg

In this paper, we examine parallel filesystems for shared deployment across multiple Linux clusters running with different hardware architectures and operating systems. Specifically, we deploy PVFS2, GPFS, Lustre, and TerraFS in our test environment containing Intel Xeon, Intel x86-64, and IBM PPC970 systems. We comment on the feature sets of each ...
While much high-performance computing is performed using massively parallel MPI applications, many workflows execute jobs with a mix of processor counts. At the extreme end of the scale, some workloads consist of large quantities of single-processor jobs. These types of workflows lead to inefficient usage of massively parallel architectures such as the IBM ...
Remote Direct Memory Access (RDMA) is an effective technology for reducing system load and improving performance. Recently, Ethernet offerings that exploit RDMA technology have become available that can potentially provide a high-performance fabric for MPI communications at lower cost than other competing technologies. The goal of this paper is to evaluate ...
With the advent of Grid computing technology and the continued improvements to high-performance network infrastructure, computational science in distributed computing environments has become an essential research platform for scientists who require access to distributed computational resources, scientific data archives, or Grid-enabled scientific ...
The design and procurement of supercomputers may require months, but the construction of a facility to house a supercomputer can extend to years. This paper describes the design and construction of a Top-50 supercomputer system and a fully-customized pre-fabricated facility to house it. The use of a co-design process reduced the time from conception to ...
In this paper, we examine the performance of two components of the NCAR Community Climate System Model (CCSM) executing on clusters with a variety of microprocessor architectures and interconnects. Specifically, we examine the execution time and scalability of the Community Atmospheric Model (CAM) and the Parallel Ocean Program (POP) on Linux clusters with ...
The University of Colorado (CU) and the National Center for Atmospheric Research (NCAR) have been deploying complementary and federated resources supporting computational science in the Western United States since 2004. This activity has expanded to include other partners in the area, forming the basis for a broader Front Range Computing Consortium (FRCC). ...
This paper presents an architecture for service hosting on virtual clusters spanning multiple administrative domains that balances the requirements of application developers and resource provider system administrators. The presented architecture and implementation use virtual machines to simplify the deployment of externally-accessible persistent Web and ...
Reducing the complexity of the hardware and software components of Linux cluster systems can significantly improve management infrastructure scalability. Moving parts, in particular hard drives, generate excess heat and have the highest failure rates among cluster node components. The use of diskless nodes simplifies deployment and management, improves ...