Optimizing Compilers for Modern Architectures: A Dependence-based Approach
A broad introduction is provided to data dependence, to the many transformation strategies it supports, and to its applications to important optimization problems such as parallelization, compiler memory hierarchy management, and instruction scheduling.
Automatic translation of FORTRAN programs to vector form
The theoretical background is developed here for employing data dependence to convert FORTRAN programs to parallel form, and transformations that use dependence to uncover additional parallelism are discussed.
Compiling Fortran D for MIMD distributed-memory machines
This work proposes to solve the problem of programming parallel machines by developing the compiler technology needed to establish a machine-independent programming model that must be easy to use, yet perform with acceptable efficiency on different parallel architectures, at least for data-parallel scientific codes.
Conversion of control dependence to data dependence
- John R. Allen, K. Kennedy, Carrie Porterfield, J. Warren
- Computer Science
- ACM-SIGACT Symposium on Principles of Programming…
- 24 January 1983
This paper presents a method for systematically converting control dependences to data dependences by eliminating goto statements and introducing logical variables to control the execution of statements in the program.
An Implementation of Interprocedural Bounded Regular Section Analysis
The experimental results demonstrate that regular section analysis is an effective means of discovering parallelism, given programs written in an appropriately modular programming style.
Practical dependence testing
Exact yet fast dependence tests are presented for certain classes of array references, as well as empirical results showing that these references dominate scientific Fortran codes.
Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution
- K. Kennedy, K. McKinley
- Computer Science
- International Workshop on Languages and Compilers…
- 12 August 1993
A new algorithm for fusing a collection of parallel and sequential loops, minimizing parallel loop synchronization while maximizing parallelism; a proof that performing fusion to maximize data locality is NP-hard; and two polynomial-time algorithms for improving data locality.
Improving cache performance in dynamic applications through data and computation reorganization at run time
It is demonstrated that run-time program transformations can substantially improve computation and data locality and, despite the complexity and cost involved, a compiler can automate such transformations, eliminating much of the associated run-time overhead.
Automatic data layout for distributed-memory machines
The proposed framework for automatic data layout selection builds and examines search spaces of candidate data layouts and capitalizes on state-of-the-art 0-1 integer programming technology to compute optimal solutions of these NP-complete problems.
PFC: A Program to Convert Fortran to Parallel Form