Abstract
State-of-the-art computer vision models are rapidly increasing in capacity, with parameter counts that far exceed the number required to fit the training set; this overparameterization yields better optimization and generalization performance. However, the huge size of contemporary models leads to large inference costs and limits their use on resource-limited devices. To reduce inference costs, convolutional filters in trained neural networks can be pruned, lowering the run-time memory and computational requirements of inference. Severe post-training pruning, however, degrades performance when the training algorithm produces dense weight vectors. We propose the use of Batch Bridgeout, a sparsity-inducing stochastic regularization scheme, to train neural networks so that they can be pruned efficiently with minimal degradation in performance. We evaluate the proposed method on the common computer vision models VGGNet, ResNet, and Wide-ResNet on the CIFAR10 and CIFAR100 image classification tasks. For all the networks, experimental results show that Batch Bridgeout-trained networks achieve higher accuracy across a wide range of pruning intensities than networks trained with Dropout or weight decay regularization.
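The exact Batch Bridgeout update is defined in the paper; the sketch below is only an illustrative approximation, assuming a Bridgeout-style perturbation in which the dropout noise term w*(m/p - 1) is replaced by |w|^(q/2)*(m/p - 1) so that the expected penalty behaves like an Lq (bridge) norm, and assuming one noise mask is sampled per forward pass (how noise is shared across a mini-batch in Batch Bridgeout is not reproduced here). The names bridgeout_perturb and BridgeoutLinear are hypothetical.

```python
import torch
import torch.nn.functional as F

def bridgeout_perturb(weight: torch.Tensor, p: float = 0.8, q: float = 1.0) -> torch.Tensor:
    # Assumed Bridgeout-style noise: replace the dropout term w*(m/p - 1)
    # with |w|^(q/2)*(m/p - 1); its expectation is zero while its variance
    # scales like |w|^q, which acts as a sparsity-inducing (Lq-like) penalty.
    mask = torch.bernoulli(torch.full_like(weight, p))  # m ~ Bernoulli(p)
    return weight + weight.abs().pow(q / 2.0) * (mask / p - 1.0)

class BridgeoutLinear(torch.nn.Linear):
    """Linear layer whose weights are stochastically perturbed during training."""
    def __init__(self, in_features, out_features, p=0.8, q=1.0, bias=True):
        super().__init__(in_features, out_features, bias=bias)
        self.p, self.q = p, q

    def forward(self, x):
        # Clean weights are used at evaluation time; pruning is applied after training.
        w = bridgeout_perturb(self.weight, self.p, self.q) if self.training else self.weight
        return F.linear(x, w, self.bias)

# Usage: drop-in replacement for nn.Linear.
layer = BridgeoutLinear(512, 10)
out = layer(torch.randn(32, 512))
```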
Highlights
The combination of larger GPUs and more effective regularization techniques, such as Dropout [1] and BatchNorm [2], has enabled deep learning practitioners to train larger models with better generalization performance compared to smaller models
Deep neural networks are increasingly deployed to resource-limited devices such as smartphones and internet-of-things devices, where these large models would not fit within the memory constraints of the device
We expect that sparse stochastic regularizers such as Batch Bridgeout can be used to combine the benefits of both sparsity and robustness for obtaining efficient and compact Deep Neural Networks (DNNs) through pruning
Summary
The combination of larger GPUs and more effective regularization techniques, such as Dropout [1] and BatchNorm [2], has enabled deep learning practitioners to train larger models with better generalization performance compared to smaller models. The increase in size and performance comes at the cost of huge computational requirements, both for training and inference. Contemporary DNN-based computer vision models have a huge run-time memory footprint due to their large number of parameters. Deep neural networks are increasingly deployed to resource-limited devices such as smartphones and internet-of-things devices, where these large models would not fit within the memory constraints of the device.

A. Classification of Pruning Techniques

We broadly classify pruning techniques for lower inference cost based on the elements pruned, the number of train-prune iterations, and the criteria used for pruning decisions, as follows. Unstructured pruning achieves higher compression rates for the same task performance because of the flexibility of fine-grained selection of which weights to eliminate. Unstructured pruning results in sparse weight matrices with the same dimensions as the unpruned ones, so specialized sparse matrix multiplication techniques are needed to exploit the sparsity for faster inference [13].
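To make the distinction concrete, below is a minimal post-training magnitude-pruning sketch using PyTorch's torch.nn.utils.prune utilities; the use of torchvision's vgg16, the 80% pruning amount, and the random (untrained) weights are illustrative assumptions, not details from the source.

```python
import torch
import torch.nn.utils.prune as prune
from torchvision.models import vgg16

# Illustrative unstructured (fine-grained) magnitude pruning of a network.
model = vgg16()  # stand-in for a trained VGGNet; trained weights would normally be loaded

for module in model.modules():
    if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear)):
        # Zero out the 80% smallest-magnitude weights; the tensor shape is unchanged,
        # so specialized sparse kernels are needed to realize inference speed-ups.
        prune.l1_unstructured(module, name="weight", amount=0.8)
        prune.remove(module, "weight")  # make the sparsity permanent

zeros = sum((m.weight == 0).sum().item() for m in model.modules()
            if isinstance(m, (torch.nn.Conv2d, torch.nn.Linear)))
total = sum(m.weight.numel() for m in model.modules()
            if isinstance(m, (torch.nn.Conv2d, torch.nn.Linear)))
print(f"global sparsity: {zeros / total:.1%}")
```

Structured (filter-level) pruning would instead remove whole output channels, e.g. via prune.ln_structured(module, "weight", amount=0.5, n=2, dim=0), trading some compression for computation that remains dense and hardware-friendly.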