Abstract
Structured pruning has been proposed to compress neural network models. However, finding a pruning rate that suppresses the accuracy degradation of the pruned model is difficult because existing structured pruning methods assign the pruning rate manually. In this paper, we propose an automatic pruning rate derivation method for structured pruning that removes the workload of inefficient manual pruning rate assignment. The pruning error (the L1-norm of the pruned weights) depends on the pruning rate; our method therefore derives the pruning rate by comparing the pruning error against a threshold. When the pruning error is smaller than the threshold, the accuracy degradation of the pruned model is suppressed. We demonstrate the superiority of the proposed method over state-of-the-art methods on CIFAR-10 and ImageNet with various ResNets. For example, the proposed method removes 56.2% of the parameters of ResNet-50 on ImageNet while achieving 75.32% accuracy, similar to earlier works.
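The core idea in the abstract, comparing a pruning error against a threshold to pick the pruning rate, can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation: the function name `derive_pruning_rate`, the per-filter L1-norm importance score, and the normalization of the pruning error by the layer's total L1-norm are all assumptions made for this example.

```python
import numpy as np

def derive_pruning_rate(weight, threshold):
    """Illustrative sketch: choose the largest pruning rate whose pruning
    error (L1-norm of the pruned filters) stays below `threshold`.

    `weight` is assumed to be a 4-D conv weight tensor of shape
    (out_channels, in_channels, kH, kW); filters are pruned as whole units,
    as in structured pruning.
    """
    # L1-norm of each output filter (a common structured-pruning importance score).
    filter_norms = np.abs(weight).sum(axis=(1, 2, 3))
    order = np.argsort(filter_norms)      # least- to most-important filters
    total_norm = filter_norms.sum()

    rate = 0.0
    for k in range(1, len(order) + 1):
        # Pruning error if the k least-important filters are removed,
        # normalized by the layer's total L1-norm (an assumed normalization).
        pruning_error = filter_norms[order[:k]].sum() / total_norm
        if pruning_error >= threshold:
            break
        rate = k / len(order)
    return rate

# Usage with a random layer and an illustrative threshold.
w = np.random.randn(64, 32, 3, 3)
print(derive_pruning_rate(w, threshold=0.05))
```

Under this sketch, each layer can receive its own pruning rate automatically from a single global threshold, which is the workload reduction the abstract describes.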