# Efficient Inference on GPUs for the Sparse Deep Neural Network Graph Challenge 2020

    @article{Hidayetoglu2020EfficientIO,
      title   = {Efficient Inference on GPUs for the Sparse Deep Neural Network Graph Challenge 2020},
      author  = {Mert Hidayetoglu and Carl Pearson and Vikram Sharma Mailthody and Eiman Ebrahimi and Jinjun Xiong and R. Nagi and W. Hwu},
      journal = {ArXiv},
      year    = {2020},
      volume  = {abs/2007.14152}
    }

This paper presents GPU performance optimization and scaling results for the Sparse Deep Neural Network Challenge 2020. Demands for network quality have increased rapidly, pushing the size, and thus the memory requirements, of many neural networks beyond the capacity of available accelerators. Sparse deep neural networks (SpDNN) have shown promise for reining in the memory footprint of large neural networks. However, there is room for improvement in implementing SpDNN operations on GPUs. This…
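The challenge's inference computation is, per layer, a sparse matrix-matrix multiplication followed by a bias add and a ReLU with activations clipped at 32. A minimal sketch of one such layer using SciPy sparse matrices follows; the matrix sizes, density, and bias value are illustrative assumptions, not the challenge datasets:

```python
import numpy as np
import scipy.sparse as sp

def spdnn_layer(Y, W, bias, clip=32.0):
    """One sparse DNN layer: Y_next = min(max(Y @ W + bias, 0), clip)."""
    Z = (Y @ W).toarray() + bias      # SpMM, then broadcast the bias
    Z = np.clip(Z, 0.0, clip)         # ReLU with upper clipping at 32
    return sp.csr_matrix(Z)           # keep the result sparse for the next layer

# Illustrative random inputs (not challenge data)
rng = np.random.default_rng(0)
Y = sp.random(4, 8, density=0.5, random_state=rng, format="csr")
W = sp.random(8, 8, density=0.25, random_state=rng, format="csr")
out = spdnn_layer(Y, W, bias=-0.1)
print(out.shape)  # (4, 8)
```

Real submissions fuse these steps into a single GPU kernel and keep the output sparse throughout; the dense intermediate here is only for clarity.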

#### References

Showing 1–10 of 22 references.

Scalable Inference for Sparse Deep Neural Networks using Kokkos Kernels

- Computer Science
- 2019 IEEE High Performance Extreme Computing Conference (HPEC)
- 2019

This work bases its sparse DNN implementation, KK-SpDNN, on the sparse linear algebra kernels within the Kokkos Kernels library, reusing the highly optimized sparse matrix-matrix multiplication kernel in Kokkos Kernels.

Performance of Training Sparse Deep Neural Networks on GPUs

- Computer Science
- 2019 IEEE High Performance Extreme Computing Conference (HPEC)
- 2019

A Fine-tune Structured Sparsity Learning (FSSL) method is proposed to regularize the structures of DNNs and accelerate their training; results show superior performance and efficiency compared with the MATLAB example code.

Accelerating DNN Inference with GraphBLAS and the GPU

- Computer Science
- 2019 IEEE High Performance Extreme Computing Conference (HPEC)
- 2019

This work addresses the 2019 Sparse Deep Neural Network Graph Challenge with an implementation using the GraphBLAS programming model. We demonstrate our solution to this challenge…

Sparse Deep Neural Network Graph Challenge

- Computer Science, Mathematics
- 2019 IEEE High Performance Extreme Computing Conference (HPEC)
- 2019

The proposed Sparse Deep Neural Network (DNN) Challenge draws upon prior challenges from machine learning, high performance computing, and visual analytics to create a challenge that is reflective of emerging sparse AI systems.

GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

- Computer Science
- NeurIPS
- 2019

GPipe is introduced, a pipeline parallelism library that allows scaling any network that can be expressed as a sequence of layers by pipelining different sub-sequences of layers on separate accelerators, resulting in almost linear speedup when a model is partitioned across multiple accelerators.

A GPU Implementation of the Sparse Deep Neural Network Graph Challenge

- Computer Science
- 2019 IEEE High Performance Extreme Computing Conference (HPEC)
- 2019

A CUDA implementation of the latest addition to the Graph Challenge, the inference computation on a collection of large sparse deep neural networks, is presented; the managed memory API available in CUDA allows simple and efficient distribution of these computations across a multi-GPU NVIDIA DGX-2 server.

GraphChallenge.org Sparse Deep Neural Network Performance

- Computer Science, Mathematics
- 2020 IEEE High Performance Extreme Computing Conference (HPEC)
- 2020

These submissions show that state-of-the-art sparse DNN execution time, T_DNN, is a strong function of the number of DNN operations performed, N_op, and underscore the need for new innovations to achieve high performance on very large sparse DNNs.

RadiX-Net: Structured Sparse Matrices for Deep Neural Networks

- Computer Science, Mathematics
- 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
- 2019

An algorithm is presented that deterministically generates RadiX-Nets: sparse DNN topologies that, as a whole, are much more diverse than X-Net topologies, while preserving X-Net's desired characteristics.

Write Quick, Run Fast: Sparse Deep Neural Network in 20 Minutes of Development Time via SuiteSparse:GraphBLAS

- Computer Science
- 2019 IEEE High Performance Extreme Computing Conference (HPEC)
- 2019

SuiteSparse:GraphBLAS is a full implementation of the GraphBLAS standard, which provides a powerful and expressive framework for creating graph algorithms based on the elegant mathematics of sparse…

Update on Triangle Counting on GPU

- Computer Science
- 2019 IEEE High Performance Extreme Computing Conference (HPEC)
- 2019

This work presents an update to the triangle-counting portion of the subgraph isomorphism static graph challenge, improving single-GPU kernel performance by introducing a dynamic work-stealing GPU kernel with persistent threads, which makes performance adaptive for large graphs without requiring a graph-analysis phase.