James A. Jablin

The performance benefits of GPU parallelism can be enormous, but unlocking this performance potential is challenging. The applicability and performance of GPU parallelizations are limited by the complexities of CPU-GPU communication. To address these communication problems, this paper presents the first fully automatic system for managing and optimizing …
GPUs are flexible parallel processors capable of accelerating real applications. To exploit them, programmers must ensure a consistent program state between the CPU and GPU memories by managing data. Manually managing data is tedious and error-prone. In prior work on automatic CPU-GPU data management, alias analysis quality limits performance, and …
Core-to-core communication bandwidth is critical for high-performance pipeline-parallel programs. Hardware communication queues are unlikely to be implemented and are perhaps unnecessary. This paper presents Liberty Queues, a high-performance lock-free software-only ring buffer, and describes the porting effort from the original x86-64 implementation to …
GPU performance depends not only on thread/warp-level parallelism (TLP) but also on instruction-level parallelism (ILP). It is not enough to schedule instructions within basic blocks; it is also necessary to exploit opportunities for ILP optimization beyond branch boundaries. Unfortunately, modern GPUs cannot dynamically carry out such optimizations because …
Researchers must often write their own simulation and analysis software. During this process they simultaneously confront both computational and scientific problems. Current strategies for aiding the generation of performance-oriented programs do not abstract the software development from the science. Furthermore, the problem is becoming increasingly …
Many researchers test graph scheduling algorithms on “random” graphs. We show that many of the random graphs used in the literature are not representative of actual programs by comparing random graphs to graphs derived from the SPECINT2000 benchmarks. We introduce Ragnarok (RAndom Graphs Never ARe OK), a new graph benchmark suite embodying characteristics …