Mark Heffernan

This paper presents a new approach to local instruction scheduling based on integer programming that produces optimal instruction schedules in a reasonable time, even for very large basic blocks. The new approach first uses a set of graph transformations to simplify the data-dependency graph while preserving the optimality of the final schedule. The(More)
This paper presents a set of efficient graph transformations for local instruction scheduling. These transformations to the data-dependency graph prune redundant and inferior schedules from the solution space of the problem. Optimally scheduling the transformed problems using an enumerative scheduler is faster and the number of problems solved to optimality(More)
This article presents the first optimal algorithm for trace scheduling. The trace is a global scheduling region used by compilers to exploit instruction-level parallelism across basic block boundaries. Several heuristic techniques have been proposed for trace scheduling, but the precision of these techniques has not been studied relative to optimality. This(More)
The superblock is a scheduling region which exposes instruction-level parallelism beyond the basic block through speculative execution of instructions. In general, scheduling superblocks is an NP-Hard optimization, and prior work includes both heuristic (polynomial-time) and optimal (enumerative) scheduling techniques. This paper presents a set of(More)
Graphics Processing Units have emerged as powerful accelerators for massively parallel, numerically intensive workloads. The two dominant software models for these devices are NVIDIA’s CUDA and the cross-platform OpenCL standard. Until now, there has not been a fully open-source compiler targeting the CUDA environment, hampering general compiler and(More)