Layerwise Sparse Coding for Pruned Deep Neural Networks with Extreme Compression Ratio

Xiao Liu, Lili Yao, Wenbin Li, Jing Huo, Yang Gao

doi:10.1609/aaai.v34i04.5927

Abstract

Deep neural network compression is important and increasingly well developed, especially for resource-constrained environments such as autonomous drones and wearable devices. The number of weights in a trained deep model can be reduced substantially with a widely used model compression technique such as pruning. Two kinds of data are usually preserved for the compressed model: non-zero weights and meta-data, where the meta-data is used to encode and decode the non-zero weights. Although pruning can yield an ideally small number of non-zero weights, existing sparse matrix coding methods still need a much larger amount of meta-data (possibly several times larger than the non-zero weights), which is a severe bottleneck for deploying very deep models. To tackle this issue, we propose a layerwise sparse coding (LSC) method that maximizes the compression ratio by drastically reducing the amount of meta-data. We first divide a sparse matrix into multiple small blocks and remove the zero blocks, and then propose a novel signed relative index (SRI) algorithm to encode the remaining non-zero blocks with much less meta-data. In addition, the proposed LSC performs parallel matrix multiplication without full decoding, which traditional methods cannot do. Through extensive experiments, we demonstrate that LSC achieves substantial gains over state-of-the-art baselines in pruned DNN compression (e.g., a 51.03x compression ratio on ADMM-Lenet) and in inference computation (i.e., reduced time and far lower memory bandwidth).
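
The first step described in the abstract, dividing a sparse matrix into small blocks and removing the zero blocks, can be illustrated with a minimal sketch. The abstract does not specify the SRI encoding, so this sketch does not attempt it: it records one boolean flag per block as stand-in meta-data, which is an assumption for illustration and not the authors' method. The function names `split_into_blocks` and `rebuild`, and the example matrix, are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def split_into_blocks(matrix: Matrix, block: int) -> Tuple[List[bool], List[Matrix]]:
    """Split `matrix` into block x block tiles in row-major order.

    Returns one flag per tile (True if the tile holds any non-zero value)
    and the list of non-zero tiles only. Zero tiles are dropped.
    Assumes both dimensions are multiples of `block`.
    """
    rows, cols = len(matrix), len(matrix[0])
    if rows % block or cols % block:
        raise ValueError("matrix dimensions must be multiples of block")
    flags: List[bool] = []
    kept: List[Matrix] = []
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            tile = [row[c:c + block] for row in matrix[r:r + block]]
            nonzero = any(v != 0 for row in tile for v in row)
            flags.append(nonzero)
            if nonzero:
                kept.append(tile)
    return flags, kept


def rebuild(flags: List[bool], kept: List[Matrix],
            rows: int, cols: int, block: int) -> Matrix:
    """Inverse of split_into_blocks: restore the dense matrix."""
    out = [[0.0] * cols for _ in range(rows)]
    tiles = iter(kept)
    idx = 0
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            if flags[idx]:
                tile = next(tiles)
                for i in range(block):
                    out[r + i][c:c + block] = tile[i]
            idx += 1
    return out


# Hypothetical pruned 4x4 weight matrix with two non-zero weights.
pruned = [
    [0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, -1.2, 0.0, 0.0],
]
flags, kept = split_into_blocks(pruned, 2)
restored = rebuild(flags, kept, 4, 4, 2)
```

In this example two of the four 2x2 blocks are entirely zero, so only two blocks and four flags need to be stored. How the remaining blocks are then encoded with little meta-data is the role of the SRI algorithm in the paper, which this sketch leaves out.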


Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Apr 3, 2020
Citations: 11

Similar Papers

Dynamic and Adaptive Threshold for DNN Compression from Scratch
Chunhui Jiang ... Chao Qian
01 Jan 2017

Structured Compression of Deep Neural Networks with Debiased Elastic Group LASSO
Oyebade K Oyedotun ... Bjorn Ottersten
01 Mar 2020

High-performance and energy-efficient deep learning for resource-constrained devices
Ao Ren
10 May 2021

Dictionary Pair-based Data-Free Fast Deep Neural Network Compression
Yangcheng Gao ... Haijun Zhang
01 Dec 2021
