Abstract

Network pruning is an efficient way to adapt large-scale deep neural networks (DNNs) to resource-constrained systems: networks are either pruned according to predefined pruning criteria, or a flexible network structure is explored with the help of neural architecture search (NAS). However, the former relies heavily on human expert knowledge, while the latter usually requires many simplifications to keep the search efficient, resulting in limited performance. This paper presents a new pruning approach, Progressive Differentiable Architecture Search (PDAS), that achieves a better balance between computational efficiency and model performance. First, a joint search-update scheme for search optimization is presented; it continuously refines the candidate number of channels in each layer by alternating between differentiable searching and evolutionary updating. The latter keeps supplying new high-probability candidates to avoid getting trapped in local minima. Second, a two-stage constrained progressive search strategy is presented for complex nonlinear networks (such as ResNet) that are harder for existing approaches to prune; it avoids the over-fitting caused by an excessively large search space and largely reduces the cost of the 1x1 convolutions in the skip connections of residual blocks with little accuracy loss. Extensive experiments on representative datasets (CIFAR-10, CIFAR-100, and ImageNet) confirm the superior performance of PDAS over most existing network pruning algorithms. Notably, compared to the state-of-the-art LFPC, PDAS prunes about 8% more FLOPs on both ResNet-110 (on CIFAR-10) and ResNet-50 (on ImageNet) with nearly identical, negligible accuracy loss.

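To make the alternating "differentiable search / evolutionary update" idea concrete, the following is a minimal, self-contained Python sketch of such a loop over per-layer candidate channel counts. It is not the authors' implementation: the layer widths, the toy objective proxy_score (standing in for validation accuracy minus a FLOPs penalty), the score-driven update of the architecture parameters, and the mutation rule are all assumptions made purely for illustration.

    # Minimal sketch of an alternating search-update loop over candidate
    # channel counts. All names and rules below are illustrative assumptions,
    # not the PDAS implementation.
    import math
    import random

    LAYER_WIDTHS = [64, 128, 256]   # assumed full channel counts per layer
    NUM_CANDIDATES = 4              # candidate channel counts kept per layer

    def proxy_score(channels):
        """Toy stand-in for validation accuracy minus a FLOPs penalty."""
        acc = sum(math.log(1 + c) for c in channels)   # diminishing returns
        flops = sum(c * c for c in channels) * 1e-5    # quadratic cost proxy
        return acc - flops

    def softmax(xs):
        m = max(xs)
        e = [math.exp(x - m) for x in xs]
        s = sum(e)
        return [v / s for v in e]

    # 1. Initialise candidate channel numbers and architecture parameters.
    candidates = [[max(1, int(w * f)) for f in (0.25, 0.5, 0.75, 1.0)]
                  for w in LAYER_WIDTHS]
    alphas = [[0.0] * NUM_CANDIDATES for _ in LAYER_WIDTHS]

    for epoch in range(30):
        # 2. Search step: nudge each candidate's architecture parameter
        #    towards configurations that improve the proxy objective
        #    (a crude stand-in for a gradient step on a supernet).
        for li, cands in enumerate(candidates):
            probs = softmax(alphas[li])
            for ci, c in enumerate(cands):
                trial = [candidates[l][max(range(NUM_CANDIDATES),
                                           key=lambda k: alphas[l][k])]
                         for l in range(len(candidates))]
                trial[li] = c
                alphas[li][ci] += 0.1 * proxy_score(trial) * probs[ci]

        # 3. Evolutionary update step: every few epochs, replace the weakest
        #    candidate in each layer with a mutation of the strongest one, so
        #    new high-probability candidates keep entering the search space.
        if epoch % 5 == 4:
            for li, cands in enumerate(candidates):
                best = max(range(NUM_CANDIDATES), key=lambda k: alphas[li][k])
                worst = min(range(NUM_CANDIDATES), key=lambda k: alphas[li][k])
                cands[worst] = max(1, cands[best] + random.choice((-8, 8)))
                alphas[li][worst] = alphas[li][best]  # fair start for newcomer

    final = [cands[max(range(NUM_CANDIDATES), key=lambda k: alphas[li][k])]
             for li, cands in enumerate(candidates)]
    print("selected channel counts per layer:", final)

In the paper's setting, the search step would update real architecture parameters by gradient descent through a weight-sharing supernet; the score-driven nudge above only stands in for that so the sketch stays runnable on its own.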