Allen Leung

Programmers for GPGPU face a rapidly changing substrate of programming abstractions, execution models, and hardware implementations. It has been established, through numerous demonstrations for particular conjunctions of application kernel, programming language, and GPU hardware instance, that it is possible to achieve significant improvements in the…
In this work we investigate the problem of scheduling instructions on idealized microprocessors with multiple pipelines, in the presence of precedence constraints, release times, deadlines, and latency constraints. A latency of $l_{ij}$ specifies that there must be at least $l_{ij}$…
Instruction scheduling is central to achieving performance in modern processors with instruction-level parallelism (ILP). Classical work in this area has spanned the theoretical foundations of algorithms for instruction scheduling with provable optimality, as well as heuristic approaches with experimentally validated performance improvements. Typically, the…
Emerging computing architectures present concurrent, heterogeneous, and hierarchical organizations. Explicit management of distributed memories, bulk communications, and the careful scheduling of data and computation for locality of reference appear to be necessary to achieve high efficiencies relative to the peak performance. In some cases, the…