Learn More
With up to 65,536 compute nodes and a peak performance of more than 360 teraflops, the Blue Genet/L (BG/L) supercomputer represents a new level of massively parallel systems. The system software stack for BG/L creates a programming and operating environment that harnesses the raw power of this architecture with great effectiveness. The design and(More)
Blue Gene/L is currently the world's fastest and most scalable supercomputer. It has demonstrated essentially linear scaling all the way to 131,072 processors in several benchmarks and real applications. The operating systems for the compute and I/O nodes of Blue Gene/L, are among the components responsible for that scalability. Compute nodes are dedicated(More)
Traditionally, there have been two approaches to providing an operating environment for high performance computing (HPC). A Full-Weight Kernel(FWK) approach starts with a general-purpose operating system and strips it down to better scale up across more cores and out across larger clusters. A Light-Weight Kernel (LWK) approach starts with a new thin kernel(More)
The Petascale era has recently been ushered in and many researchers have already turned their attention to the challenges of exascale computing. To achieve petascale computing two broad approaches for kernels were taken, a lightweight approach embodied by IBM Blue Gene's CNK, and a more fullweight approach embodied by Cray's CNL. There are strengths and(More)
Linux<sup>&#174;</sup>, or more specifically, the Linux API, plays a key role in HPC computing. Even for extreme-scale computing, a known and familiar API is required for production machines. However, an off-the-shelf Linux distribution faces challenges at extreme scale. To date, two approaches have been used to address the challenges of providing an(More)
The Blue Gene/L system at the Department of Energy Lawrence Livermore National Laboratory in Livermore, California is the world’s most powerful supercomputer. It has achieved groundbreaking performance in both standard benchmarks as well as real scientific applications. In that process, it has enabled new science that simply could not be done before. Blue(More)
TLB misses have been considered an important source of system overhead and one of the causes that limit scalability on large supercomputers. This assumption lead to HPC lightweight kernel designs that usually statically map page table entries to TLB entries and do not take TLB misses. While this approach worked for petascale clusters, programming and(More)
Traditional full-featured operating systems are known to have properties that limit the scalability of distributed memory parallel programs, the most common programming paradigm utilized in high end computing. Furthermore, as processor counts increase with the most capable systems, the necessary activity to manage the system becomes more of a burden. To(More)