Meenakshi A. Kandaswamy

This paper presents a unified framework that optimizes out-of-core programs by exploiting locality and parallelism, and reducing communication overhead. For out-of-core problems where the data set sizes far exceed the size of the available in-core memory, it is particularly important to exploit the memory hierarchy by optimizing the I/O accesses. We(More)
This paper presents compiler algorithms to optimize out-of-core programs. These algorithms consider loop and data layout transformations in a tied framework. The performance of an out-of-core loop nest containing many references can be improved by a combination of restructuring the loops and file layouts. This approach considers array references one-by-one(More)
Many large scale applications have significant I/O requirements as well as computational and memory requirements. Unfortunately, the limited number of I/O nodes provided by contemporary message-passing distributed-memory architectures such as Intel Paragon and IBM SP-2 limits the I/O performance of these applications severely. In this paper, we examine some(More)
The use of parallel machines to solve large scale computational problems in science and engineering has increased considerably in recent times. Many of these problems have computational requirements which stretch the capabilities of even the fastest machine available today. In addition to requiring a great deal of computational power, these problems usually(More)
Many large scale applications have significant I/O requirements as well as computational and memory requirements. Unfortunately, the limited number of I/O nodes provided in a typical configuration of the modern message-passing distributed-memory architectures such as Intel Paragon and IBM SP-2 limits the I/O performance of these applications severely. In(More)
Parallel machines are an important part of the scientific application developer's tool box and the processing demands placed on these machines are rapidly increasing. Many scientific applications tend to perform high volume data storage, data retrieval and data processing, which demands high performance from the I/O subsystem. In this paper, we conduct an(More)
Many scientific applications tend to perform high-volume data storage, data retrieval, and data processing, all of which demand high performance from the I/O subsystem. The focus and contribution of this work is to study the I/O behavior of the Hartree-Fock (HF) method using PASSION. HF’s I/O phases can contribute up to 62.34% of the total execution time.(More)
Generally, parallel scientific applications are executed on a fixed number of processors determined to be optimal by an efficiency analysis of the application's computational kernel. It is well-known, however, that the degree of parallelism found in different parts of an application varies. In this paper, we present the results of an in-depth study quantifying(More)