Learn More
The state-of-the-art, yet decades-old, architecture of high-performance computing systems has its compute and storage resources separated. It is thus limited for modern data-intensive scientific applications, because every I/O operation needs to be transferred via the network between the compute and storage resources. In this paper we propose an architecture that has a…
Current leadership-class machines suffer from a significant imbalance between their computational power and their I/O bandwidth. While Moore's law ensures that the computational power of high-performance computing systems increases with every generation, the same is not true for their I/O subsystems. The scalability challenges faced by existing parallel…
While network bandwidth is steadily increasing, it is doing so at a much slower rate than the corresponding increase in CPU performance. This trend has widened the gap between CPU and network speed. In this paper, we investigate improvements to I/O performance by exploiting this gap. We harness idle CPU resources to compress network traffic, reducing the…
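The core idea is a CPU-for-bandwidth trade: spare cycles shrink the payload so fewer bytes cross the slower network. The minimal Python sketch below illustrates that trade-off; the payload contents, compression levels, and the choice of zlib are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch: trading idle CPU cycles for network bandwidth by
# compressing an I/O buffer before it goes on the wire.
# Payload and compression levels are illustrative assumptions.
import time
import zlib

payload = b"timestep=42 temperature=300.15 pressure=101.325 " * 20000

for level in (1, 6, 9):                          # fast ... thorough
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    cpu_cost = time.perf_counter() - start
    ratio = len(payload) / len(compressed)
    print(f"level={level}: {len(payload)} -> {len(compressed)} bytes "
          f"(ratio {ratio:.1f}x, CPU {cpu_cost * 1000:.1f} ms)")

# The bytes actually crossing the network drop by the compression ratio,
# so the transfer wins whenever the saved wire time exceeds cpu_cost.
```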
Unmatched computation and storage performance in new HPC systems has led to a plethora of I/O optimizations, ranging from application-side collective I/O to network- and disk-level request scheduling on the file system side. As we deal with ever larger machines, the interference produced by multiple applications accessing a shared parallel file system in a…
High-performance computing (HPC) and distributed systems rely on a diverse collection of system software to provide application services, including file systems, schedulers, and web services. Such system software services must manage highly concurrent requests, interact with a wide range of resources, and scale well in order to be successful.…
Parallel computing is indisputably part of the future of high-performance computing. For distributed-memory systems, MPI is widely accepted as a de facto standard. However, I/O is often neglected when considering parallel performance. In this article, a number of I/O strategies for distributed-memory systems will be examined. These will be evaluated in…
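One strategy such a comparison typically covers is collective MPI-IO, where every rank writes its own disjoint block of a shared file in a single coordinated call. The sketch below is a minimal illustration using mpi4py; the file name, block size, and offsets are assumptions for demonstration, not the strategies or parameters evaluated in the article.

```python
# Minimal collective MPI-IO sketch (assumes mpi4py and an MPI runtime).
# Each rank writes a disjoint block of one shared file in a single
# coordinated call; file name and block size are illustrative.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

block = np.full(1024, rank, dtype=np.float64)      # this rank's data

fh = MPI.File.Open(comm, "shared_output.dat",
                   MPI.MODE_WRONLY | MPI.MODE_CREATE)
fh.Write_at_all(rank * block.nbytes, block)        # collective write
fh.Close()
```

Run with, for example, mpiexec -n 4 python write_shared.py; the collective call lets the MPI library aggregate the four blocks into fewer, larger file system requests than four independent writes would produce.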
The state-of-the-art storage architecture of high-performance computing systems was designed decades ago, and at today's scale and level of concurrency it is showing significant limitations. Our recent work proposed a new architecture to address the I/O bottleneck of the conventional design, and the system prototype (FusionFS) demonstrated its…
Current leadership-class machines suffer from a significant imbalance between their computational power and their I/O bandwidth. I/O forwarding is a paradigm that attempts to bridge the increasing performance and scalability gap between the compute and I/O components of leadership-class machines to meet the requirements of data-intensive applications by…
State-of-the-art yet decades-old architecture of high-performance computing (HPC) systems has its compute and storage resources separated. It has shown its limits for today's data-intensive scientific applications, because every I/O operation needs to be transferred via the network between the compute and storage cliques. This paper proposes a distributed storage layer…
I/O is the critical bottleneck for data-intensive scientific applications on HPC systems and leadership-class machines. Applications running on these systems may encounter bottlenecks because the I/O systems cannot handle the overwhelming intensity and volume of I/O requests. Applications and systems use I/O forwarding to aggregate and delegate I/O requests…
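To make the aggregate-and-delegate idea concrete, the sketch below buffers many small client writes at a forwarder, coalesces contiguous ones, and issues a few large requests to backing storage. The IOForwarder class, flush threshold, and file name are hypothetical illustrations under that assumption, not the interface of any actual forwarding layer.

```python
# Illustrative sketch of aggregate-and-delegate I/O forwarding:
# many small client writes are buffered, coalesced, and delegated
# to the backing store as a few large requests.
import os

class IOForwarder:
    def __init__(self, backing_path, flush_threshold=1 << 20):
        self.backing_path = backing_path
        self.flush_threshold = flush_threshold   # bytes buffered before delegating
        self.pending = []                        # list of (offset, data) requests
        self.buffered = 0

    def write(self, offset, data):
        """Accept a small client write; delegate once enough is buffered."""
        self.pending.append((offset, data))
        self.buffered += len(data)
        if self.buffered >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Coalesce contiguous requests and issue them to the backing store."""
        self.pending.sort()
        mode = "r+b" if os.path.exists(self.backing_path) else "wb"
        with open(self.backing_path, mode) as f:
            run_off, run_buf = None, bytearray()
            for off, data in self.pending:
                if run_off is not None and off == run_off + len(run_buf):
                    run_buf += data              # extend the contiguous run
                else:
                    if run_off is not None:
                        f.seek(run_off)
                        f.write(run_buf)
                    run_off, run_buf = off, bytearray(data)
            if run_off is not None:
                f.seek(run_off)
                f.write(run_buf)
        self.pending.clear()
        self.buffered = 0

# Usage: 4096 one-byte writes reach the backing file as a single 4 KiB request.
fwd = IOForwarder("aggregated.dat", flush_threshold=4096)
for i in range(4096):
    fwd.write(i, b"x")
fwd.flush()
```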