Abstract
As deep models are increasingly deployed in resource-constrained environments, network compression has become an essential part of deep neural network research. In this paper, we identify a mutual relationship between kernel weights, termed Inter-Layer Kernel Correlation (ILKC): the kernel weights of two different convolution layers share substantial similarity in shape and value. Based on this relationship, we propose a new compression method, Inter-Layer Kernel Prediction (ILKP), which represents convolutional kernels with fewer bits by exploiting the similarity between kernel weights in convolutional neural networks. Furthermore, to effectively adapt the inter prediction scheme from video coding technology, we integrate a linear transformation into the prediction scheme, which significantly enhances compression efficiency. The proposed method achieves 93.77% top-1 accuracy at a $4.1\times$ compression ratio compared to the ResNet110 baseline on CIFAR10, i.e., a 0.04% top-1 accuracy improvement with a smaller memory footprint. Moreover, combined with quantization, the proposed method achieves a $13\times$ compression ratio with little performance degradation relative to the ResNet baselines trained on CIFAR10 and CIFAR100.
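To make the prediction idea concrete, below is a minimal sketch of inter-layer kernel prediction with a linear transformation, in the spirit of the scheme described above. It assumes a per-kernel scale-and-offset transform fitted by least squares and uniform quantization of the prediction residual; the function names, the exact form of the transform, and the bit allocation are illustrative assumptions, not the paper's specification.

```python
import numpy as np

def predict_kernel(ref, target, n_bits=4):
    """Predict `target` from reference kernel `ref` via a least-squares
    linear transform (scale a, offset b), then quantize the residual
    to n_bits. Returns the coded representation (a, b, q, step)."""
    r = ref.ravel().astype(np.float64)
    t = target.ravel().astype(np.float64)
    # Fit t ~ a * r + b in the least-squares sense.
    A = np.stack([r, np.ones_like(r)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, t, rcond=None)
    residual = t - (a * r + b)
    # Uniform symmetric quantization of the residual.
    max_abs = np.abs(residual).max()
    step = max_abs / (2 ** (n_bits - 1) - 1) if max_abs > 0 else 1.0
    q = np.round(residual / step).astype(np.int8)
    return a, b, q, step

def reconstruct_kernel(ref, a, b, q, step, shape):
    """Rebuild the target kernel from its reference and coded residual."""
    return (a * ref.ravel() + b + q * step).reshape(shape)

# Toy example: two correlated 3x3 kernels from different layers.
rng = np.random.default_rng(0)
k_ref = rng.standard_normal((3, 3))
k_tgt = 0.8 * k_ref + 0.1 + 0.05 * rng.standard_normal((3, 3))
a, b, q, step = predict_kernel(k_ref, k_tgt)
k_hat = reconstruct_kernel(k_ref, a, b, q, step, k_tgt.shape)
print("max reconstruction error:", np.abs(k_hat - k_tgt).max())
```

Under this reading, only the transform parameters and a low-bit residual are stored per predicted kernel, which is where the memory saving comes from; the stronger the inter-layer correlation, the smaller the residual and the fewer bits needed.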
Highlights
Deep Neural Networks (DNNs), in particular Convolutional Neural Networks (CNNs), show exceptional performance compared with traditional methods across a wide variety of tasks, such as image classification [1]–[3], object detection [4]–[6], and speech recognition [7], [8].
Based on Inter-Layer Kernel Correlation (ILKC), this paper proposes a simple and effective weight compression method, Inter-Layer Kernel Prediction (ILKP), which shares weights across layers through prediction.
Summary
Deep Neural Networks (DNNs), in particular Convolutional Neural Networks (CNNs), show exceptional performance compared with traditional methods across a wide variety of tasks, such as image classification [1]–[3], object detection [4]–[6], and speech recognition [7], [8]. With this performance improvement, the size of CNN models has increased enormously, and recent works keep growing in parameter count for better performance. To deploy such models in resource-constrained environments, network compression has become essential. Representative methods include pruning [9]–[12], quantization [13]–[15], knowledge distillation [16]–[18], weight sharing [19]–[22], and efficient structural designs, e.g., Depthwise Separable Convolution [23]–[26]. These methods are widely used to compress CNN models. Based on Inter-Layer Kernel Correlation (ILKC), the observation that kernels in different convolution layers are highly similar in shape and value, this paper proposes a simple and effective weight compression method, Inter-Layer Kernel Prediction (ILKP), which shares weights across layers through prediction, as sketched below.
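The summary references ILKC without specifying how the correlation is measured. The sketch below shows one plausible reading, cosine similarity between flattened kernels of two layers, so that each kernel in one layer can be matched to its most similar counterpart in another. The weight layout (out_channels, in_channels, k, k) and the choice of metric are assumptions for illustration, not the paper's definition.

```python
import numpy as np

def kernel_correlation(layer_a, layer_b):
    """Pairwise cosine similarity between every k x k kernel of two conv
    layers. Weights are assumed shaped (out_channels, in_channels, k, k)."""
    ka = layer_a.reshape(-1, layer_a.shape[-2] * layer_a.shape[-1])
    kb = layer_b.reshape(-1, layer_b.shape[-2] * layer_b.shape[-1])
    ka = ka / (np.linalg.norm(ka, axis=1, keepdims=True) + 1e-12)
    kb = kb / (np.linalg.norm(kb, axis=1, keepdims=True) + 1e-12)
    return ka @ kb.T  # shape: [num_kernels_a, num_kernels_b]

# Toy example: match each kernel in layer B to its closest kernel in layer A.
rng = np.random.default_rng(1)
w_a = rng.standard_normal((16, 8, 3, 3))
w_b = rng.standard_normal((16, 8, 3, 3))
sim = kernel_correlation(w_a, w_b)
best_match = sim.argmax(axis=0)  # reference index for each kernel in B
print("mean best-match similarity:", sim.max(axis=0).mean())
```

High values in this similarity matrix would indicate good reference candidates for the prediction step; real trained networks are reported to exhibit far stronger inter-layer similarity than the random weights used in this toy example.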