Joint DNN Partition Deployment and Resource Allocation for Delay-Sensitive Deep Learning Inference in IoT

Wenchen He, Shaoyong Guo, Xuesong Qiu, Feng Qi, Song Guo

doi:10.1109/jiot.2020.2981338

Abstract

Widely deployed Internet-of-Things (IoT) mobile devices (MDs) generate huge volumes of data, from which accurate information must be extracted in real time by compute-intensive deep learning (DL) inference tasks. Because of its multilayer structure, a deep neural network (DNN) is well suited to the mobile-edge computing (MEC) environment: DL tasks can be offloaded to DNN partitions deployed on MEC servers (MECSs) to speed up inference. In this article, we first model the arrival of DL tasks as a Poisson process and develop a tandem queueing model to evaluate the end-to-end (E2E) inference delay of DL tasks across multiple DNN partitions. To minimize the E2E delay, we formulate a joint optimization problem of partition deployment and resource allocation in MECSs (JPDRA). Since the JPDRA is a mixed-integer nonlinear programming (MINLP) problem, we decompose it into two subproblems: a computing resource allocation (CRA) problem under a fixed partition deployment decision, and a DNN partition deployment (DPD) problem that minimizes the optimal-delay function obtained from the CRA problem. We then design a CRA algorithm based on Markov approximation and a low-complexity DPD algorithm to obtain a near-optimal solution in polynomial time. The simulation results demonstrate that the proposed algorithms are more efficient and can reduce the average E2E delay by 25.7% with better convergence performance.
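To make the tandem queueing idea concrete, the sketch below computes the mean E2E delay of a chain of DNN partitions under a simplifying assumption that the abstract does not state: each partition is treated as an M/M/1 queue with exponential service. Under that assumption the departures of each stage are again Poisson at the same rate, so the mean delay is the sum of the per-stage sojourn times 1/(mu_i - lambda). The function name `tandem_mm1_delay` and the numeric rates are hypothetical and are not taken from the paper, whose actual service model may differ.

```python
from typing import Sequence


def tandem_mm1_delay(arrival_rate: float, service_rates: Sequence[float]) -> float:
    """Mean end-to-end sojourn time through M/M/1 stages in series.

    Assumption (not from the paper): every DNN partition behaves as an
    M/M/1 queue, so stage i contributes 1 / (mu_i - lambda) to the delay.
    """
    total = 0.0
    for mu in service_rates:
        if mu <= arrival_rate:
            # A stage whose service rate does not exceed the arrival
            # rate has an unbounded queue, so no finite delay exists.
            raise ValueError("unstable stage: service rate must exceed arrival rate")
        total += 1.0 / (mu - arrival_rate)
    return total


# Hypothetical rates in tasks per second: the same total capacity (10)
# split two different ways across two partitions.
uneven = tandem_mm1_delay(2.0, [4.0, 6.0])
even = tandem_mm1_delay(2.0, [5.0, 5.0])
print(uneven, even)
```

With these illustrative numbers the even split yields the lower delay, which hints at why the allocation of computing resources across partitions is itself an optimization problem once the deployment is fixed.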

Journal: IEEE Internet of Things Journal
Publication Date: Oct 1, 2020
Citations: 115

Similar Papers

Computation Resource Allocation in Mobile Blockchain-enabled Edge Computing Networks
Yiping Zuo ... Yu Han
09 Aug 2020

Task Proactive Caching Based Computation Offloading and Resource Allocation in Mobile-Edge Computing Systems
Hongyu Zhao ... Ying Wang
01 Jun 2018

NOMA-Assisted Multi-MEC Offloading for IoVT Networks
Fengqian Guo ... Chang Wen Chen
IEEE Wireless Communications | VOL. 28
01 Aug 2021

Energy Efficiency Based Joint Computation Offloading and Resource Allocation in Multi-Access MEC Systems
Xiaotong Yang ... Hongbo Zhu
IEEE Access | VOL. 7
01 Jan 2019
