Abstract

From object detection to semantic segmentation, deep learning has achieved many groundbreaking results in recent years. However, the increasing complexity of neural networks greatly hinders their execution on embedded platforms. This has motivated the development of several neural network minimisation techniques, among which pruning has attracted considerable attention. In this work, we perform a case study on a series of methods with the goal of finding a small model that runs fast on embedded devices. First, we suggest a simple but effective ranking criterion for filter pruning called Mean Weight. Then, we combine this new criterion with a threshold-aware, layer-sensitive filter pruning method, called T-sensitive pruning, to retain high accuracy. Furthermore, the pruning algorithm follows a structured filter pruning approach that removes all selected filters and their dependencies from the DNN model, resulting in fewer computations and thus lower inference time on lower-end CPUs. To validate the effectiveness of the proposed method, we perform experiments on three different datasets (with 3, 101, and 1000 classes) and two different deep neural networks (i.e., SICK-Net and MobileNet V1). We obtain speedups of up to 13x on lower-end CPUs (Armv8) with less than 1% drop in accuracy. This satisfies the goal of deploying deep neural networks on embedded hardware while attaining a good trade-off between inference time and accuracy.
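The abstract does not spell out how the Mean Weight criterion or the T-sensitive threshold are computed. Below is a minimal sketch, assuming the criterion scores each convolutional filter by the mean absolute value of its weights and that a per-layer threshold selects the filters to remove; the function names and the threshold parameter are illustrative, not taken from the paper.

    import numpy as np

    def mean_weight_scores(conv_weights: np.ndarray) -> np.ndarray:
        # conv_weights has shape (num_filters, in_channels, k_h, k_w),
        # as in a standard convolutional layer. Each filter is scored by
        # the mean of the absolute values of its weights (assumed form
        # of the Mean Weight criterion).
        return np.abs(conv_weights).mean(axis=(1, 2, 3))

    def filters_to_prune(conv_weights: np.ndarray, threshold: float) -> np.ndarray:
        # Return indices of filters whose score falls below a per-layer
        # threshold (hypothetical stand-in for the T-sensitive threshold).
        scores = mean_weight_scores(conv_weights)
        return np.where(scores < threshold)[0]

    # Example: a layer with 8 filters, 3 input channels, 3x3 kernels.
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(8, 3, 3, 3))
    print(filters_to_prune(weights, threshold=0.75))

In a structured pruning pass, the selected filters and their dependent channels in the following layer would then be physically removed, which is what yields the reduced computation rather than mere sparsity.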
