Analyzing CUDA workloads using a detailed GPU simulator

Ali Bakhoda, George L. Yuan, Wilson W. L. Fung, Henry Wong, Tor M. Aamodt
2009 IEEE International Symposium on Performance Analysis of Systems and Software
Modern Graphics Processing Units (GPUs) provide sufficiently flexible programming models that understanding their performance can provide insight into designing tomorrow's manycore processors, whether those are GPUs or otherwise. The combination of multiple, multithreaded, SIMD cores makes studying these GPUs useful in understanding tradeoffs among memory, data, and thread level parallelism. While modern GPUs offer orders of magnitude more raw computing power than contemporary CPUs, many important…
This paper has highly influenced 253 other papers.
This paper has 1,237 citations.


Publications citing this paper.

Optimizing Cache Bypassing and Warp Scheduling for GPUs

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems • 2018

Shared Last-Level Cache Management and Memory Scheduling for GPGPUs with Hybrid Main Memory

ACM Trans. Embedded Comput. Syst. • 2018

A software technique to enhance register utilization of Convolutional Neural Networks on GPGPUs

2017 International Conference on Applied System Innovation (ICASI) • 2017

Application-Specific Autonomic Cache Tuning for General Purpose GPUs

2017 International Conference on Cloud and Autonomic Computing (ICCAC) • 2017

Characterization of Neural Network Backpropagation on Chiplet-based GPU Architectures

Colin A. Weinshenker

Characterizing convolutional neural network workloads on a detailed GPU simulator

2017 International SoC Design Conference (ISOCC) • 2017

Dynamic Resizing on Active Warps Scheduler to Hide Operation Stalls on GPUs

IEEE Transactions on Parallel and Distributed Systems • 2017

Semantic Scholar estimates that this publication has 1,238 citations based on the available data.



