Abstract
Deploying deep neural networks (DNNs) on resource-constrained devices is challenging because of the limited memory and computational power of such devices. In most cases, pruning techniques do not prune DNNs to the full extent, and redundancy remains in the pruned models. To address this, a mixed filter pruning approach based on principal component analysis (PCA) and the geometric median is presented. First, a pre-trained model is analyzed with PCA to identify the important filters in every layer. These important filters are then used to reconstruct the network with fewer layers and fewer filters per layer. A new network with this optimized structure is constructed and trained on the data. Second, the trained model is analyzed using the geometric median as a base. The redundant filters are identified and removed, which compresses the network further. Finally, the pruned model is fine-tuned to regain accuracy. Experiments on the CIFAR-10, CIFAR-100 and ILSVRC 2017 datasets show that the proposed mixed pruning approach is feasible and can compress the network to a greater extent without any significant loss of accuracy. With VGG-16 on CIFAR-10, the number of operations and the number of parameters are reduced by 18.56× and 3.33×, respectively, with a loss of almost 1% in accuracy. The compression rate for AlexNet on CIFAR-10 is 2.61× in terms of the number of operations and 4.85× in terms of the number of parameters, with a gain of 1.2% in accuracy. On CIFAR-100, VGG-19 is compressed by 16.02× in terms of the number of operations and 36× in terms of the number of parameters, with a 2.6% loss of accuracy. Similarly, the compression rate for VGG-19 on ILSVRC 2017 is 1.87× for operations and 2.4× for parameters, with a 0.5% loss in accuracy.
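The two analysis steps named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not state how PCA results are turned into a filter count or how the geometric median is used to rank filters. The sketch assumes that (a) the number of important filters in a layer is the number of principal components needed to explain a chosen fraction of the variance of that layer's activations, and (b) redundant filters are those with the smallest total distance to all other filters in the layer, a common approximation of closeness to the geometric median. The function names, the `variance_threshold` parameter and the array shapes are hypothetical.

```python
import numpy as np


def significant_components(activations, variance_threshold=0.999):
    """Count principal components needed to explain a fraction of variance.

    activations: array of shape (num_samples, num_filters), assumed to hold
    one layer's filter responses (hypothetical layout).
    """
    centered = activations - activations.mean(axis=0, keepdims=True)
    # Squared singular values of the centered data are proportional to
    # the variance captured by each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    cumulative = np.cumsum(variances) / variances.sum()
    # First position at which the cumulative ratio reaches the threshold.
    count = int(np.searchsorted(cumulative, variance_threshold)) + 1
    return min(count, len(variances))


def redundant_filters(weights, num_to_prune):
    """Return indices of the filters nearest the layer's geometric median.

    weights: array of shape (num_filters, ...), one entry per filter.
    Assumption: closeness to the geometric median is approximated by the
    smallest summed Euclidean distance to every other filter.
    """
    flat = weights.reshape(weights.shape[0], -1)
    diffs = flat[:, None, :] - flat[None, :, :]
    distances = np.linalg.norm(diffs, axis=-1)  # pairwise distance matrix
    total_distance = distances.sum(axis=1)
    return np.argsort(total_distance)[:num_to_prune]
```

Under these assumptions, `significant_components` would suggest how many filters to keep when the network is reconstructed in the first stage, and `redundant_filters` would select candidates for removal in the second stage, before fine-tuning.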
Highlights
Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance in many applications, such as face recognition [1], object detection [2], semantic segmentation [3] and other classification tasks
Modern deep neural networks are computationally expensive and memory intensive, and they require substantial computational power for training and deployment; bringing advances in neural network technology to mobile devices has therefore become a challenge
Much work in recent years has focused on reducing the size of pre-trained neural networks so that they can be deployed on mobile devices for inference [4, 5]
Summary
Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance in many applications, such as face recognition [1], object detection [2], semantic segmentation [3] and other classification tasks. Much work in recent years has focused on reducing the size of pre-trained neural networks so that they can be deployed on mobile devices for inference [4, 5]. Recent architectures, such as those built on the inception module [6] or residual connections [7], have millions of parameters and require extensive computation and storage. These architectures produce state-of-the-art accuracy, and most designers start from pre-trained networks for transfer learning. These networks are rarely evaluated on the given datasets, and only the classifier is trained and fine-tuned. It is therefore of great importance to devise deep neural network models with relatively low complexity and high accuracy.