Most network pruning methods focus on identifying redundant channels in pre-trained models, which is inefficient due to its three-step process: pre-training; pruning and fine-tuning; and reconfiguration. In this paper, we propose a pruning-from-scratch framework that unifies these steps into a single approach. We introduce nuclear norm-based regularization to maintain the representational capacity of large networks during pruning. Combining this with MACs-based regularization enhances the performance of the pruned network at the target compression rate. Our bi-level optimization approach simultaneously improves pruning efficiency and representational capacity. Experimental results show that our method achieves 75.4% accuracy on ImageNet without a pre-trained network, using only 41% of the original model's computational cost. It also attains 0.5% higher performance when compressing the SSD network for object detection. Furthermore, we analyze the effects of nuclear norm-based regularization.
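As a rough illustration of how a nuclear norm term might enter the training objective, the sketch below (PyTorch) computes the nuclear norm of flattened feature maps as a sum of singular values and combines it with a task loss and a MACs budget penalty. The weights `lam_nuc` and `lam_macs`, the penalty form, and all tensor names are hypothetical; the paper's exact bi-level formulation may differ.

```python
import torch
import torch.nn.functional as F

def nuclear_norm(features: torch.Tensor) -> torch.Tensor:
    # Flatten each sample's feature map to a (batch, C*H*W) matrix and
    # sum its singular values; a larger nuclear norm indicates
    # higher-rank, richer feature representations.
    mat = features.flatten(start_dim=1)
    return torch.linalg.svdvals(mat).sum()

def macs_penalty(current_macs: torch.Tensor, target_macs: float) -> torch.Tensor:
    # Hypothetical MACs penalty: zero when the (differentiable) MACs
    # estimate is at or below the target budget, linear above it.
    return torch.relu(current_macs / target_macs - 1.0)

# Toy usage with random tensors standing in for a real forward pass.
feats = torch.randn(8, 64, 7, 7, requires_grad=True)   # intermediate feature maps
logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))
task_loss = F.cross_entropy(logits, targets)

lam_nuc, lam_macs = 1e-3, 1.0            # hypothetical trade-off weights
est_macs = torch.tensor(2.1e9)           # a differentiable MACs estimate in practice
loss = (task_loss
        - lam_nuc * nuclear_norm(feats)          # subtracted: encourage high-rank features
        + lam_macs * macs_penalty(est_macs, 2.0e9))
loss.backward()
```

Note the sign convention in this sketch: the nuclear norm term is subtracted so that minimizing the loss encourages high-rank (capacity-preserving) features, while the MACs penalty pushes the network toward the target compression rate.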