Abstract
Network acceleration has become an active research topic because of the substantial challenge of deploying convolutional networks in real-time applications or on resource-limited devices. A wide variety of pruning-based acceleration methods have been proposed to increase the sparsity of parameters so that computations involving the pruned parameters can be omitted. However, such element-wise pruning methods are hard to exploit for acceleration without specially customized speed-up algorithms. Due to this difficulty, recent work has turned to pruning filters or channels instead, which directly reduces the number of matrix multiplications. While the Channel Pruning method transforms the original CNN into a kernel-wise or channel-wise pruned one, Runtime Neural Pruning (RNP) argues that statically pruned models lose capacity on hard tasks, since potentially significant weights are discarded during pruning. Pruning channels dynamically is therefore a promising alternative.

In this paper, we propose Channel Threshold-Weighting (T-Weighting) modules to select and prune unimportant feature channels at inference time. As the pruning is performed dynamically, the method is called Dynamic Channel Pruning (DCP). DCP consists of the original convolutional network together with a number of "Channel T-Weighting" modules attached at certain layers. Each "Channel T-Weighting" module assigns a weight to its corresponding channels and prunes the channels whose weights are zero. The pruned channels accelerate the CNN, while multiplying the remaining channels by their weights enhances feature expression. The reasons for not considering fully-connected layers are two-fold: 1. convolution operations account for the vast majority of the computational cost; 2. DCP is designed not only for classification, but for many tasks that use a CNN as their backbone network. As a specific choice for h(·), we propose the thresholded sigmoid (T-sigmoid) function, which introduces sparsity into w_l: h(x) = σ(x) · 1{x > T}, where σ(·) denotes the sigmoid function and 1{x} is the Boolean indicator function, which outputs 1 when its argument is true and 0 otherwise. The T-sigmoid function is inspired by spike-and-slab models, which formulate distributions over hidden variables as the product of a binary spike variable and a real-valued code. DCP is trained in a layer-by-layer manner: we first train the "Channel T-Weighting" module, then set the threshold according to the given pruned ratio, and finally adjust the threshold iteratively.

The proposed DCP reaches a 5× speed-up with only a 4.77% drop in accuracy on the ILSVRC2012 dataset. Comparing the increase in error against baseline methods (Filter Pruning, Channel Pruning, and RNP), DCP consistently outperforms the other methods as the speed-up ratio increases. Experiments show that DCP also consistently outperforms the baseline model on both CIFAR-10 and CIFAR-100. Comparing the full model with the 3× accelerated model shows that DCP generalizes well to scene classification on the Places365-Challenge dataset with VGG-16, with top-1 and top-5 accuracy dropping by 2.07% and 1.96%, respectively. DCP (3×) trained with ResNet-50 also suffers only slight drops, with top-1 and top-5 accuracy falling by 2.78% and 2.55%, respectively, outperforming Channel Pruning (our impl.) by a large margin. For detection on the PASCAL VOC2007 dataset with Faster R-CNN, our 2× and 4× acceleration models show mAP drops of 0.5% and 1.7%, respectively, indicating little accuracy degradation and demonstrating that DCP also generalizes well to detection.
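To make the channel gating described above concrete, here is a minimal PyTorch sketch of the T-sigmoid and a "Channel T-Weighting"-style module. The abstract does not specify how the per-channel scores fed into the T-sigmoid are produced, so this sketch assumes a small squeeze-style scoring branch (global average pooling followed by a linear layer); the class name ChannelTWeighting, the scoring branch, and the default threshold are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only; the scoring branch and threshold handling are assumptions.
import torch
import torch.nn as nn


def t_sigmoid(x: torch.Tensor, threshold: float) -> torch.Tensor:
    """Thresholded sigmoid h(x) = sigmoid(x) * 1{x > T}.

    Scores at or below the threshold are zeroed, marking the
    corresponding channels as prunable for this input.
    """
    return torch.sigmoid(x) * (x > threshold).to(x.dtype)


class ChannelTWeighting(nn.Module):
    """Assigns a weight to every feature channel; zero-weight channels
    could be skipped by the following convolution at inference time."""

    def __init__(self, num_channels: int, threshold: float = 0.0):
        super().__init__()
        # Assumed scoring branch: global average pooling + one linear layer.
        self.score = nn.Linear(num_channels, num_channels)
        self.threshold = threshold  # would be tuned per layer for a target pruned ratio

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        pooled = x.mean(dim=(2, 3))                  # global average pool -> (B, C)
        scores = self.score(pooled)                  # raw per-channel scores
        weights = t_sigmoid(scores, self.threshold)  # (B, C); zeros mark pruned channels
        return x * weights.unsqueeze(-1).unsqueeze(-1)


if __name__ == "__main__":
    feats = torch.randn(2, 64, 28, 28)
    gate = ChannelTWeighting(num_channels=64, threshold=0.0)
    out = gate(feats)
    # Fraction of channels dynamically zeroed out for this batch:
    pruned = (out.abs().sum(dim=(2, 3)) == 0).float().mean().item()
    print(f"pruned channel fraction: {pruned:.2f}")
```

In the training scheme described above, the threshold would be set per layer to meet a given pruned ratio and then adjusted iteratively; in this sketch it is simply a fixed constructor argument.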