Abstract

Deploying deep convolutional neural networks (DCNNs) on devices with limited memory or in applications with strict latency requirements remains a challenge. Weight-based filter pruning is an effective technique that has been widely applied to DCNN compression and acceleration because of its low computational cost and high flexibility. However, many existing methods select the filters to be pruned by evaluating only a filter's direct effect on its own layer, which limits the performance of the pruned model. In this paper, we point out that the key to improving filter-level pruning criteria is how filter importance is evaluated, and we propose a new weight-based filter pruning method. The proposed method jointly considers the direct and indirect effects of each filter, which better reflects filter importance and allows unimportant filters to be removed safely. Extensive experiments demonstrate that the proposed weight-based method achieves better performance than previous works, reaching the level of data-based methods.
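The sketch below illustrates, in broad strokes, what a weight-only criterion combining direct and indirect effects can look like; it is not the paper's exact formulation. The scoring function, the `alpha` balancing term, and the use of the next layer's input-channel weights as the indirect term are all illustrative assumptions.

```python
import numpy as np

def filter_importance(w_curr, w_next, alpha=1.0):
    """Score filters of a conv layer using only weight magnitudes.

    Illustrative sketch, not the paper's exact criterion.
    w_curr: current-layer weights, shape (C_out, C_in, k, k).
    w_next: next-layer weights, shape (C_next, C_out, k, k); its input
            channels correspond one-to-one to the current layer's filters.
    alpha:  hypothetical coefficient balancing the two terms.
    """
    # Direct effect: L1 magnitude of each filter in the current layer.
    direct = np.abs(w_curr).sum(axis=(1, 2, 3))
    # Indirect effect: L1 magnitude of the next-layer weights that consume
    # the feature map produced by each filter.
    indirect = np.abs(w_next).sum(axis=(0, 2, 3))
    return direct + alpha * indirect

# Usage: keep the top half of filters by score, prune the rest.
rng = np.random.default_rng(0)
w1 = rng.standard_normal((64, 32, 3, 3))
w2 = rng.standard_normal((128, 64, 3, 3))
scores = filter_importance(w1, w2)
keep = np.argsort(scores)[len(scores) // 2:]
```

Filters with low scores contribute little either through their own weights or through the downstream weights that read their outputs, which is why such a criterion can identify filters that are safe to remove without consulting training data.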
