Software pipelining showdown: optimal vs. heuristic methods in a production compiler
TLDR
This comparison has indeed provided a quantitative validation of the SGI compiler's pipeliner, leading to increased confidence in both techniques, and is believed to be the first published measurement of runtime performance for ILP based generation of software pipelines.
The multiflow trace scheduling compiler
TLDR
The Multiflow compiler is described, and the Multiflow practice and experience with compiling for instruction-level parallelism beyond basic blocks are reported.
Highly Scalable Near Memory Processing with Migrating Threads on the Emu System Architecture
TLDR
A new, highly scalable PGAS memory-centric system architecture is presented in which migrating threads travel to the data they access; a comparison of key parameters with a variety of today's systems of differing architectures indicates the potential advantages.
Phase Ordering of Register Allocation and Instruction Scheduling
TLDR
This paper presents a unified approach to instruction scheduling and global (beyond basic blocks) register allocation and states that register allocation should be performed before scheduling.
Parallel processing: a smart compiler and a dumb machine
TLDR
A new fine-grained parallel architecture and a compiler that together offer order-of-magnitude speedups for ordinary scientific code are developed.
Mitochondrial DNA. I. Preparation and properties of mitochondrial DNA from chick liver.
TLDR
It is shown that the renaturation rate of mitochondrial DNA is in good agreement with the earlier suggestion that the total genetic information in the mitochondrial population of chick liver is that contained in a double-stranded DNA molecule with a molecular weight of 10·10^6–11·10^6.
Lifting the restriction of aggregate data motion in parallel processing
TLDR
Experimental results are presented that show code generated by the authors to be competitive with hand-written code for alignment networks with restricted control, along with an alternative method for controlling alignment networks: unrestricted individual control over the data interconnects.