Learn More
Current leadership-class machines suffer from a significant imbalance between their computational power and their I/O bandwidth. While Moore's law ensures that the computational power of high-performance computing systems increases with every generation, the same is not true for their I/O subsystems. The scalability challenges faced by existing parallel(More)
Unmatched computation and storage performance in new HPC systems have led to a plethora of I/O optimizations ranging from application-side collective I/O to network and disk-level request scheduling on the file system side. As we deal with ever larger machines, the interference produced by multiple applications accessing a shared parallel file system in a(More)
While network bandwidth is steadily increasing, it is doing so at a much slower rate than the corresponding increase in CPU performance. This trend has widened the gap between CPU and network speed. In this paper, we investigate improvements to I/O performance by exploiting this gap. We harness idle CPU resources to compress network traffic, reducing the(More)
State-of-the-art yet decades old architecture of high performance computing (HPC) systems has its compute and storage resources separated. It has shown limits for today’s dataintensive scientific applications, because every I/O needs to be transferred via the network between the compute and storage cliques. This paper proposes a distributed storage layer(More)
Current leadership-class machines suffer from a significant imbalance between their computational power and their I/O bandwidth. I/O forwarding is a paradigm that attempts to bridge the increasing performance and scalability gap between the compute and I/O components of leadership-class machines to meet the requirements of data-intensive applications by(More)
HPC platforms are capable of generating huge amounts of metadata about different entities including jobs, users, and files. <i>Simple metadata</i>, which describe the attributes of these entities (e.g., file size, name, and permissions mode), has been well recorded and used in current systems. However, only a limited amount of <i>rich metadata</i>, which(More)
State-of-the-art, yet decades-old, architecture of high-performance computing systems has its compute and storage resources separated. It thus is limited for modern data-intensive scientific applications because every I/O needs to be transferred via the network between the compute and storage resources. In this paper we propose an architecture that hss a(More)
The state-of-the-art storage architecture of high-performance computing systems was designed decades ago, and with today's scale and level of concurrency, it is showing significant limitations. Our recent work proposed a new architecture to address the I/O bottleneck of the conventional wisdom, and the system prototype (FusionFS) demonstrated its(More)
I/O is the critical bottleneck for data-intensive scientific applications on HPC systems and leadership-class machines. Applications running on these systems may encounter bottlenecks because the I/O systems cannot handle the overwhelming intensity and volume of I/O requests. Applications and systems use I/O forwarding to aggregate and delegate I/O requests(More)