Abstract

Neural network pruning techniques can effectively accelerate neural network models, making it possible to deploy them on edge devices. In this paper, we propose to prune neural networks using data variance. Unlike existing methods, our criterion is robust: it does not become invalid depending on the number of data batches used or the amount of training performed. We also propose a pruning compensation technique that fuses the information of each pruned convolutional kernel into the nearest remaining kernel, which helps retain information that would otherwise be lost by pruning. We evaluate the proposed method on several standard datasets and compare it with current state-of-the-art methods; our method consistently achieves better performance. For example, on Tiny ImageNet, it prunes 54.2% of the FLOPs of ResNet-50 while improving accuracy by 0.22%.
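
To make the two ideas above concrete, the following PyTorch sketch scores each output channel by the variance of its activations, prunes the lowest-variance channels, and folds each pruned kernel into its nearest kept kernel. This is a minimal illustration under stated assumptions, not the paper's exact algorithm: the function name, the L2 nearest-kernel matching, and the simple additive fusion rule are placeholders chosen for clarity.

```python
# Illustrative sketch (not the paper's exact algorithm): variance-based channel
# pruning with a compensation step that folds each pruned kernel into the
# nearest kept kernel. Names and the fusion rule are assumptions.
import torch


def variance_prune_with_compensation(conv_weight: torch.Tensor,
                                     feature_maps: torch.Tensor,
                                     keep_ratio: float = 0.5):
    """conv_weight: (C_out, C_in, k, k) weights of one convolution layer.
    feature_maps: (N, C_out, H, W) activations produced by that layer."""
    c_out = conv_weight.shape[0]
    n_keep = max(1, int(round(c_out * keep_ratio)))

    # Pruning criterion (assumed form): variance of each output channel's
    # activations over the batch and spatial dimensions.
    channel_var = feature_maps.var(dim=(0, 2, 3))             # shape (C_out,)
    keep_idx = torch.topk(channel_var, n_keep).indices
    kept = set(keep_idx.tolist())
    prune_idx = [i for i in range(c_out) if i not in kept]

    # Compensation (assumed form): add each pruned kernel's weights to the
    # kept kernel that is closest in L2 distance, so part of the pruned
    # information is retained in the remaining kernels.
    new_weight = conv_weight.clone()
    flat = conv_weight.reshape(c_out, -1)
    for p in prune_idx:
        dists = torch.cdist(flat[p].unsqueeze(0), flat[keep_idx])  # (1, n_keep)
        nearest = keep_idx[dists.argmin()]
        new_weight[nearest] += conv_weight[p]

    # Return the compensated weights of the kept channels and their indices.
    return new_weight[keep_idx], keep_idx
```

For example, with keep_ratio=0.5 on a layer with 64 output channels, this keeps the 32 highest-variance channels and merges each of the other 32 kernels into its nearest kept neighbor; a real implementation would also need to adjust the input channels of the following layer accordingly.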
