Abstract

In recent years, deep and wide Convolutional Neural Networks (CNNs) have achieved remarkable results in various computer vision tasks. However, deploying such deep models directly on resource-constrained devices is impractical. Reducing the width of the network is an effective strategy for making the model more compact. In this paper, we propose a channel pruning method that simultaneously accelerates and compresses deep CNNs while maintaining their accuracy. First, a pre-trained CNN model is evaluated layer by layer according to a hybrid statistics-based criterion, and channels with negative scores, together with their corresponding filters, are discarded. To recover the accuracy degraded by channel pruning, knowledge distillation is then adopted during fine-tuning to achieve stronger generalization. Experiments on the ILSVRC-12 benchmark demonstrate the effectiveness of our method when applied to popular CNN architectures. We achieve a $3.79\times$ speed-up and $19.1\times$ compression on VGG-16, and $3.6\times$ acceleration with $5.5\times$ compression on Darknet (a fully convolutional network), both with only a slight accuracy decrease. Even for modern networks such as ResNet, we achieve a $2.1\times$ speed-up and $2.2\times$ compression with only a 0.4% accuracy loss.
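The pipeline the abstract describes — score channels per layer, drop the negative-score channels along with their filters (and the matching input channels of the next layer), then fine-tune the pruned student against the original teacher with knowledge distillation — can be sketched as follows. This is a minimal illustrative sketch: the mean-shifted L1-norm score and the `channel_scores`, `prune_layer`, and `distill_loss` helpers are assumptions standing in for the paper's actual hybrid statistics-based criterion, not the authors' implementation.

```python
import numpy as np

def channel_scores(w):
    """Per-output-channel statistic for a conv weight w of shape
    (out_c, in_c, k, k): the L1 norm of each filter, shifted by the
    layer mean so that weak channels score negative.
    (Illustrative stand-in for the paper's hybrid criterion.)"""
    l1 = np.abs(w).sum(axis=(1, 2, 3))
    return l1 - l1.mean()

def prune_layer(w_cur, w_next):
    """Discard negative-score output channels of w_cur and the
    corresponding input channels of the following layer w_next."""
    keep = channel_scores(w_cur) >= 0
    return w_cur[keep], w_next[:, keep]

def distill_loss(student_logits, teacher_logits, T=4.0):
    """Hinton-style distillation loss (softened-softmax KL divergence),
    usable to fine-tune the pruned student against the unpruned teacher."""
    s = np.exp(student_logits / T); s /= s.sum()
    t = np.exp(teacher_logits / T); t /= t.sum()
    return float((t * (np.log(t) - np.log(s))).sum() * T * T)

rng = np.random.default_rng(0)
w1 = rng.normal(size=(8, 3, 3, 3))    # conv1: 8 filters over RGB input
w2 = rng.normal(size=(16, 8, 3, 3))   # conv2 consumes conv1's 8 channels
p1, p2 = prune_layer(w1, w2)
print(p1.shape[0], p2.shape[1])       # same reduced channel count in both layers
```

Pruning a channel in one layer must also remove the matching input slice of the next layer, which is why `prune_layer` edits both weight tensors at once; repeating this layer by layer and then minimizing `distill_loss` on the pruned network mirrors the prune-then-distill procedure the abstract outlines.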
