EdgeCompress: Coupling Multidimensional Model Compression and Dynamic Inference for EdgeAI

Hao Kong, Xiangzhong Luo, Ravi Subramaniam, Di Liu, Shuo Huai, Christian Makaya, Weichen Liu, Qian Lin

doi:10.1109/tcad.2023.3276938

Abstract

Convolutional neural networks (CNNs) have demonstrated encouraging results in image classification tasks. However, their prohibitive computational cost hinders deployment onto resource-constrained embedded devices. To address this issue, we propose EdgeCompress, a comprehensive compression framework to reduce the computational overhead of CNNs. In EdgeCompress, we first introduce dynamic image cropping, where we design a lightweight foreground predictor to accurately crop the most informative foreground object of input images for inference, which avoids redundant computation on background regions. Subsequently, we present compound shrinking to collaboratively compress the three dimensions (depth, width, and resolution) of CNNs according to their contribution to accuracy and model computation. Dynamic image cropping and compound shrinking together constitute a multidimensional CNN compression framework, which comprehensively reduces the computational redundancy in both input images and neural network architectures, thereby improving the inference efficiency of CNNs. Further, we present a dynamic inference framework to efficiently process input images with different recognition difficulties: we cascade multiple models of different complexities from our compression framework and dynamically select a model for each input image. This further reduces computational redundancy and improves inference efficiency, facilitating the deployment of advanced CNNs onto embedded hardware. Experiments on ImageNet-1K demonstrate that EdgeCompress reduces the computation of ResNet-50 by 48.8% while improving the top-1 accuracy by 0.8%. Meanwhile, we improve the accuracy by 4.1% with similar computation compared to HRank, the state-of-the-art compression framework. The source code and models are available at
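The dynamic inference idea in the abstract (cascading models of different complexities and choosing one per input) can be illustrated with a minimal sketch. The abstract does not state how a model is selected for an image, so the confidence-threshold exit rule below is an assumption, as are the names `cascade_predict`, `small_model`, `large_model`, and the toy cost values; none of them come from the paper or its code.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: a "model" maps an input to class probabilities.
Model = Callable[[object], Sequence[float]]


def cascade_predict(
    models: List[Tuple[Model, float]],
    image: object,
    threshold: float = 0.8,
) -> Tuple[int, float, int]:
    """Run models from cheapest to most expensive and stop early.

    `models` holds (model, cost) pairs sorted by increasing cost.
    Returns (predicted_class, total_cost_spent, index_of_answering_model).
    The confidence threshold is an assumed exit rule, not the paper's.
    """
    if not models:
        raise ValueError("cascade needs at least one model")
    spent = 0.0
    for idx, (model, cost) in enumerate(models):
        probs = list(model(image))
        spent += cost
        confidence = max(probs)
        is_last = idx == len(models) - 1
        # Exit when this model is confident, or when no larger model remains.
        if confidence >= threshold or is_last:
            return probs.index(confidence), spent, idx
    raise AssertionError("unreachable: the last model always answers")


# Toy stand-ins for a compressed (cheap) and a larger (expensive) classifier.
def small_model(image: object) -> Sequence[float]:
    # Confident only on inputs labelled "easy" in this toy setup.
    return [0.9, 0.1] if image == "easy" else [0.55, 0.45]


def large_model(image: object) -> Sequence[float]:
    return [0.05, 0.95] if image == "hard" else [0.97, 0.03]


cascade = [(small_model, 1.0), (large_model, 4.0)]
easy = cascade_predict(cascade, "easy")  # answered by the small model
hard = cascade_predict(cascade, "hard")  # falls through to the large model
```

In this toy setup the easy input costs 1.0 unit and the hard input costs 5.0 units, which shows why a cascade saves computation only when most inputs exit at the cheaper models.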

Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems	Publication Date: Dec 1, 2023
License type: mit
