Abstract

In this paper, we propose an online ensemble distillation (OED) method that automatically prunes blocks/layers of a target network by transferring knowledge from a strong teacher in an end-to-end manner. To accomplish this, we first introduce a soft mask that scales the output of each block in the target network and enforce sparsity of the masks through sparsity regularization. A strong teacher network is then constructed online by replicating the target network and ensembling the discriminative features from each replica as its own features. Cooperative learning between the multiple target networks and the teacher network is further conducted in a closed-loop form, which improves the performance of both. To solve the optimization problem in an end-to-end manner, we employ the fast iterative shrinkage-thresholding algorithm (FISTA), which quickly and reliably removes the redundant blocks whose soft masks reach zero. Compared with other structured pruning methods that rely on iterative fine-tuning, the proposed OED is trained more efficiently in a single training cycle. Extensive experiments demonstrate the effectiveness of OED, which not only simultaneously compresses and accelerates a variety of CNN architectures but also enhances the robustness of the pruned networks.
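To make the block-level soft mask concrete, the following PyTorch-style sketch gates each residual block's output with a learnable scalar mask and adds an L1 penalty on all masks to the training loss. It is a minimal illustration only; the module and function names (MaskedResidualBlock, sparsity_penalty) and the penalty weight are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MaskedResidualBlock(nn.Module):
    """Residual block whose branch output is scaled by a learnable soft mask.

    Sketch under assumed names: when sparsity regularization drives the mask
    to zero, the block reduces to an identity mapping and can be removed.
    """
    def __init__(self, block: nn.Module):
        super().__init__()
        self.block = block                       # original residual branch F(x)
        self.mask = nn.Parameter(torch.ones(1))  # soft mask, initialized to 1

    def forward(self, x):
        # y = x + m * F(x); a zero mask turns the block into an identity shortcut
        return x + self.mask * self.block(x)

def sparsity_penalty(model: nn.Module, weight: float = 1e-4) -> torch.Tensor:
    """L1 regularization over all soft masks, encouraging some masks to reach zero."""
    masks = [m.mask for m in model.modules() if isinstance(m, MaskedResidualBlock)]
    if not masks:
        return torch.zeros(())
    return weight * torch.stack([m.abs().sum() for m in masks]).sum()
```

In this reading, the total training loss would combine the supervised and distillation terms with sparsity_penalty(model), and blocks whose masks are exactly zero after optimization are dropped from the pruned network.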

Highlights

  • In recent years, convolutional neural networks (CNNs) have achieved remarkable success in many computer vision tasks, for instance image recognition [24], [39], [70], [72], object detection [13], [65], and semantic segmentation [69].

  • Different from these approaches, our method constructs a strong teacher network online, without pre-training, providing more knowledge to improve the performance of the target network (a sketch of this online ensembling follows this list).

  • Aiming to prune residual blocks, we add a soft mask after each block to determine its importance and propose online ensemble distillation to acquire more knowledge and improve the accuracy of the pruned network.
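
The highlight above on the online teacher can be illustrated with the hedged sketch below, assuming the teacher's features are obtained by averaging the features of several identically structured target networks and that distillation uses a KL divergence between temperature-softened logits. The names (OnlineEnsembleTeacher, extract_features, distillation_loss) and the averaging scheme are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OnlineEnsembleTeacher(nn.Module):
    """Builds teacher logits online by ensembling features from several
    replicated target networks (a sketch; the ensembling rule is an assumption)."""
    def __init__(self, targets: nn.ModuleList, feat_dim: int, num_classes: int):
        super().__init__()
        self.targets = targets
        self.classifier = nn.Linear(feat_dim, num_classes)  # teacher head

    def forward(self, x):
        # assumes each target exposes an extract_features(x) method
        feats = [t.extract_features(x) for t in self.targets]
        teacher_feat = torch.stack(feats, dim=0).mean(dim=0)  # ensemble by averaging
        return self.classifier(teacher_feat)

def distillation_loss(student_logits, teacher_logits, temperature: float = 3.0):
    """KL divergence between softened teacher and student distributions.
    Whether teacher_logits are detached is a design choice; in a closed-loop
    scheme gradients may be allowed to flow into the teacher as well."""
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=1),
        F.softmax(teacher_logits / t, dim=1),
        reduction="batchmean",
    ) * (t * t)
```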


Summary

INTRODUCTION

Convolutional neural networks (CNNs) have achieved remarkable success in many computer vision tasks, for instance image recognition [24], [39], [70], [72], object detection [13], [65], and semantic segmentation [69]. To mitigate the drop in accuracy, learning-based structured pruning trains the networks from scratch with sparse constraints on the weights [2], [75] or on the scaling factors [31], [56], [78], using supervised class labels. Lin et al. [52] proposed a global and dynamic pruning scheme that reduces the number of redundant filters by greedy alternating updates. All these greedy-based pruning methods iteratively prune each filter or layer and retrain the remaining model in a multi-stage manner, which is prohibitively costly when compressing deeper networks. During training, the weights of layer l are updated by gradient descent as W_l ← W_l − η · ∂L_CE(W)/∂W_l, where the partial derivative of the cross-entropy loss L_CE(W) with respect to W_l is computed by back-propagation and η is the learning rate.
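
The update rule above, combined with the FISTA step mentioned in the abstract, suggests a training step in which the network weights follow plain gradient descent on the cross-entropy loss while the block masks receive a proximal (soft-thresholding) step that can drive them exactly to zero. The sketch below illustrates that reading; the step sizes, the penalty weight lambda_, the parameter split, and the function names (soft_threshold, train_step) are assumptions, and the paper's FISTA schedule additionally uses momentum-like extrapolation not shown here.

```python
import torch

def soft_threshold(x: torch.Tensor, thresh: float) -> torch.Tensor:
    """Proximal operator of the L1 norm: shrinks values toward zero and sets
    small values exactly to zero (the core step of ISTA/FISTA updates)."""
    return torch.sign(x) * torch.clamp(x.abs() - thresh, min=0.0)

def train_step(weights, masks, loss, lr=0.1, mask_lr=0.01, lambda_=1e-4):
    """One illustrative update on lists of parameters: SGD on the weights,
    gradient step plus L1 proximal shrinkage on the block masks."""
    grads = torch.autograd.grad(loss, list(weights) + list(masks))
    w_grads, m_grads = grads[:len(weights)], grads[len(weights):]
    with torch.no_grad():
        for w, g in zip(weights, w_grads):
            w -= lr * g                               # W_l <- W_l - eta * dL_CE/dW_l
        for m, g in zip(masks, m_grads):
            m.copy_(soft_threshold(m - mask_lr * g,   # gradient step on the data loss
                                   mask_lr * lambda_))  # then soft-thresholding shrinkage
    return loss
```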

ONLINE ENSEMBLE DISTILLATION FOR RESIDUAL BLOCK PRUNING
OPTIMIZATION
EXPERIMENTS
CONCLUSION
