I/O data access is a recognized performance bottleneck in high-end computing. Several commercial and research parallel file systems have been developed in recent years to ease this bottleneck. These advanced file systems perform well on some applications but may not perform well on others. They have not reached their full potential in mitigating…
Serotyping forms the basis of national and international surveillance networks for Salmonella, one of the most prevalent foodborne pathogens worldwide (1-3). Public health microbiology is currently being transformed by whole-genome sequencing (WGS), which opens the door to serotype determination using WGS data. SeqSero (www.denglab.info/SeqSero) is a novel…
Parallel file systems have become a common component of modern high-end computers to mask the ever-increasing gap between disk data access speed and CPU computing power. However, while working well for certain applications, current parallel file systems lack the ability to effectively handle concurrent I/O requests with data synchronization needs, whereas…
The performance gap between computing power and the I/O system is ever increasing, while more and more High Performance Computing (HPC) applications are becoming data intensive. This study describes an I/O data replication scheme, named the Pattern-Direct and Layout-Aware (PDLA) scheme, to alleviate this performance gap. The basic…
Many scientific applications spend a significant portion of their execution time accessing data from files. Various optimization techniques exist to improve data access performance, such as data prefetching and data layout optimization. However, the optimization process is usually difficult because of the complexity involved in understanding I/O behavior.
Parallel file systems are designed to mask the ever-increasing gap between CPU and disk speeds via parallel I/O processing. While they have become an indispensable component of modern high-end computing systems, their inadequate performance is a critical issue facing the HPC community today. Conventionally, a parallel file system stripes a file across…
In this study, the authors propose a simple performance model to promote a better integration between the parallel I/O middleware layer and parallel file systems. They show that application-specific data layout optimization can reduce overall data access delay considerably for many applications. Implementation results under MPI-IO middleware and PVFS2 file…
Scientific computing is becoming more data-intensive; however, I/O throughput is not growing at the same rate. MPI-IO and parallel file systems are expected to help bridge the gap by increasing data access parallelism. Compared to traditional I/O systems, some factors are more important in a parallel I/O system in order to achieve better performance, such as…
The dynamic programming approach solves complex problems efficiently by breaking them down into simpler sub-problems, and is widely used in scientific computing. With the increasing data volume of scientific applications and the development of multi-core/multi-processor hardware technologies, it is necessary to develop efficient techniques for parallelizing…
High-performance computing is widely used for scientific discovery by running scientific computation programs. Many of these applications are becoming more and more data intensive [1]. They generate or access huge amounts of data during some execution phases. However, traditional supercomputers are designed for computing-intensive tasks. They usually have…