Abstract

Deep neural networks (DNNs) have attracted considerable attention in various real-world applications due to their strong performance in representation learning. However, running a DNN requires tremendous memory resources, which significantly restricts DNNs from being deployed on resource-constrained platforms (e.g., IoT and mobile devices). Lightweight DNNs are designed to suit the characteristics of mobile devices, but the hardware resources of mobile and IoT devices are extremely limited, so the resource consumption of lightweight models needs to be reduced further. However, current neural network compression approaches (e.g., pruning, quantization, and knowledge distillation) work poorly on lightweight DNNs, which are already simplified. In this paper, we present a novel framework called Smart-DNN, which can efficiently reduce the memory requirements of running DNNs on resource-constrained platforms. Specifically, we slice a neural network into several segments and use SZ error-bounded lossy compression to compress each segment separately while keeping the network structure unchanged. When running the network, we first store the compressed network in memory and then partially decompress the corresponding parts layer by layer. According to experimental results on four popular lightweight DNNs (commonly used on resource-constrained platforms), Smart-DNN achieves memory savings of 1/10∼1/5 while only slightly sacrificing inference accuracy, keeping the neural network structure unchanged, and incurring acceptable extra runtime overhead.
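
The following is a minimal sketch of the compress-per-segment, decompress-per-layer idea described above, assuming a plain feed-forward network stored as per-layer weight matrices. A simple error-bounded uniform quantizer plus zlib stands in for the SZ compressor, and the toy network, shapes, and error bound are illustrative assumptions rather than the paper's actual implementation.

```python
# Sketch: keep all layer weights compressed in memory and decompress only the
# layer currently being executed, as Smart-DNN does with SZ lossy compression.
import zlib
import numpy as np

def compress_layer(w, err_bound=1e-3):
    """Quantize weights onto an error-bounded grid (|w - w'| <= err_bound),
    then deflate the integer codes. Stand-in for SZ error-bounded compression."""
    step = 2 * err_bound
    q = np.round(w / step).astype(np.int32)
    return zlib.compress(q.tobytes()), w.shape, step

def decompress_layer(blob, shape, step):
    """Recover an approximation of the original weights for one layer."""
    q = np.frombuffer(zlib.decompress(blob), dtype=np.int32).reshape(shape)
    return q.astype(np.float32) * step

# Toy 3-layer MLP with random weights; only the compressed blobs stay resident.
rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 64)).astype(np.float32) for _ in range(3)]
compressed = [compress_layer(w) for w in layers]

def infer(x):
    # Decompress one layer at a time, use it, then discard it, so at most one
    # dense weight matrix is in memory alongside the compressed network.
    for blob, shape, step in compressed:
        w = decompress_layer(blob, shape, step)
        x = np.maximum(x @ w, 0.0)  # dense layer + ReLU
        del w
    return x

print(infer(rng.standard_normal((1, 64)).astype(np.float32)).shape)
```

The design trade-off this illustrates is the one quantified in the abstract: memory drops to the size of the compressed blobs, at the cost of a small accuracy loss from the error-bounded reconstruction and extra runtime spent decompressing each layer on demand.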
