Abstract

Deep neural networks (DNNs) have been widely used in many fields. With ever-increasing model sizes, the scalability of DNNs suffers. Sparse deep neural networks (SpDNNs) are a promising way to resolve this problem, but their sparse data make them difficult to execute efficiently on GPUs due to load imbalance and irregular memory accesses. The recent MIT/IEEE/Amazon GraphChallenge has produced several significant advances in fitting sparse DNNs onto GPUs, but we observe that none of these earlier efforts is an absolute winner across all datasets, because each considers only a limited optimization space. In this paper, we identify new opportunities for optimizing SpDNN execution through a comprehensive analysis of previous work. Based on this enlarged design space, we present sparsity-aware SpMM algorithms that systematically explore performance-optimal solutions for SpDNN execution on GPUs and generate optimized SpMM kernel implementations. Compared to the 2020 HPEC Sparse DNN Challenge champions, our approach achieves an inference throughput of up to 55.6 TeraEdges per second, with speedups of up to 13.74× [1] and 22.29× [2] on a single NVIDIA V100 GPU. We also show that, in many cases, our approach on 4 GPUs outperforms the 2020 Challenge champion running on 768 GPUs. The source code is available at https://github.com/CGCL-codes/Graphchallenge21.
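For context, the core computation the abstract refers to is sparse matrix-matrix multiplication (SpMM) applied layer by layer with a ReLU activation. The following is a minimal, naive CUDA sketch of that baseline computation, not the paper's sparsity-aware kernels: a CSR-based SpMM with one thread per output element, which exhibits exactly the load imbalance (rows with many nonzeros) and irregular memory accesses (gathers through column indices) that the proposed approach targets. All names and the tiny example matrices are illustrative assumptions, not taken from the paper.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Naive CSR-based SpMM with ReLU: C = ReLU(A * B), where A is sparse
// (m x k, CSR) and B, C are dense row-major (k x n and m x n).
// One thread per output element; long rows of A cause load imbalance,
// and colIdx-driven gathers from B cause irregular memory accesses.
__global__ void csr_spmm_relu(int m, int n,
                              const int* rowPtr, const int* colIdx,
                              const float* vals, const float* B, float* C)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= m || col >= n) return;

    float acc = 0.0f;
    for (int p = rowPtr[row]; p < rowPtr[row + 1]; ++p)
        acc += vals[p] * B[colIdx[p] * n + col];

    C[row * n + col] = fmaxf(acc, 0.0f);   // ReLU, as in each SpDNN layer
}

int main() {
    // Tiny illustrative example: 2x3 sparse A times 3x2 dense B.
    const int m = 2, k = 3, n = 2;
    int   hRowPtr[] = {0, 2, 3};          // row 0: 2 nonzeros, row 1: 1 nonzero
    int   hColIdx[] = {0, 2, 1};
    float hVals[]   = {1.0f, 2.0f, -3.0f};
    float hB[k * n] = {1, 2, 3, 4, 5, 6}; // k x n, row-major
    float hC[m * n];

    int *dRowPtr, *dColIdx; float *dVals, *dB, *dC;
    cudaMalloc(&dRowPtr, sizeof(hRowPtr)); cudaMalloc(&dColIdx, sizeof(hColIdx));
    cudaMalloc(&dVals, sizeof(hVals));     cudaMalloc(&dB, sizeof(hB));
    cudaMalloc(&dC, sizeof(hC));
    cudaMemcpy(dRowPtr, hRowPtr, sizeof(hRowPtr), cudaMemcpyHostToDevice);
    cudaMemcpy(dColIdx, hColIdx, sizeof(hColIdx), cudaMemcpyHostToDevice);
    cudaMemcpy(dVals, hVals, sizeof(hVals), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, sizeof(hB), cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((n + block.x - 1) / block.x, (m + block.y - 1) / block.y);
    csr_spmm_relu<<<grid, block>>>(m, n, dRowPtr, dColIdx, dVals, dB, dC);
    cudaMemcpy(hC, dC, sizeof(hC), cudaMemcpyDeviceToHost);

    for (int i = 0; i < m; ++i)
        printf("%.1f %.1f\n", hC[i * n + 0], hC[i * n + 1]);
    return 0;
}
```

In an SpDNN inference pass, a kernel of this kind would be invoked once per layer, feeding each layer's output into the next; the paper's contribution is choosing, per layer and per sparsity pattern, a better-performing kernel variant from a larger design space than this one-thread-per-element baseline.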
