Abstract
Deploying deep learning models on resource-constrained (edge) devices is challenging due to their high computational demands and large model sizes. Early-exit neural networks are one approach to making deep learning models more efficient for such devices by reducing computational cost and latency. However, even with early exits, model size can remain an obstacle to deployment on edge devices. To address this problem, we propose a section-wise compression technique for early-exit neural networks with intermediate classifiers. Our approach divides the model into a few sections and applies weight-clustering-based compression with a different setting for each section, preventing accuracy loss at the intermediate classifiers. We show that knowledge distillation can be used in the retraining phase to transfer knowledge from the uncompressed to the compressed sections and to accelerate recovery from the performance drop introduced by the weight clustering stages. Evaluation on the CIFAR10 and CIFAR100 datasets with ResNet and WideResNet architectures demonstrates that the proposed technique compresses an early-exit neural network at a high compression ratio with minimal impact on the accuracy of the intermediate classifiers. The method achieves compression ratios of more than 36 and 22 times for ResNet18 with three shallow classifiers on CIFAR10 and CIFAR100, respectively, with an ensemble accuracy loss of less than 1%. When the shallow classifiers are removed from the early-exit model, the resulting static model reaches compression ratios of up to 64 and 52 times for ResNet18 and WideResNet50, respectively, on CIFAR10 with an accuracy loss of less than 2.5%.
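The abstract only outlines the method, but its two core ingredients, per-section weight clustering and distillation-guided retraining, can be sketched concretely. The following is a minimal, hypothetical Python/PyTorch sketch, not the paper's implementation: cluster_weights, compress_section, kd_loss, and the per-section cluster counts in cluster_config are illustrative assumptions.

import torch
import torch.nn.functional as F


def cluster_weights(weight: torch.Tensor, n_clusters: int, iters: int = 20) -> torch.Tensor:
    """Quantize a weight tensor to n_clusters shared values via 1-D k-means."""
    flat = weight.detach().flatten()
    # Initialize centroids uniformly over the weight range.
    centroids = torch.linspace(flat.min().item(), flat.max().item(),
                               n_clusters, device=flat.device)
    for _ in range(iters):
        # Assign each weight to its nearest centroid, then recompute centroids.
        assign = (flat.unsqueeze(1) - centroids.unsqueeze(0)).abs().argmin(dim=1)
        for k in range(n_clusters):
            members = flat[assign == k]
            if members.numel() > 0:
                centroids[k] = members.mean()
    assign = (flat.unsqueeze(1) - centroids.unsqueeze(0)).abs().argmin(dim=1)
    return centroids[assign].view_as(weight)


def compress_section(section: torch.nn.Module, n_clusters: int) -> None:
    """Replace every conv/linear weight in one model section with its clustered version."""
    with torch.no_grad():
        for module in section.modules():
            if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear)):
                module.weight.copy_(cluster_weights(module.weight, n_clusters))


def kd_loss(student_logits, teacher_logits, labels, T: float = 4.0, alpha: float = 0.9):
    """Retraining loss: soft targets from the uncompressed teacher plus hard labels."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard


# Illustrative per-section settings (assumed, not from the paper): sections
# feeding the shallow early-exit classifiers keep more clusters so that the
# intermediate classifiers lose less accuracy.
cluster_config = {"section1": 32, "section2": 16, "section3": 8}

In this sketch, each section would be clustered in turn with its own cluster count from cluster_config, then retrained with kd_loss using the uncompressed model's logits as the teacher before the next section is compressed.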