RT-mDL

Neiwen Ling, Daqi Xie, Kai Wang, Yuze He, Guoliang Xing

doi:10.1145/3485730.3485938

Abstract

Recent years have witnessed an emerging class of real-time applications, e.g., autonomous driving, in which resource-constrained edge platforms need to execute a set of mixed real-time Deep Learning (DL) tasks concurrently. Such an application paradigm poses major challenges due to the huge compute workload of deep neural network models, the diverse performance requirements of different tasks, and the lack of real-time support in existing DL frameworks. In this paper, we present RT-mDL, a novel framework to support mixed real-time DL tasks on edge platforms with heterogeneous CPU and GPU resources. RT-mDL aims to optimize mixed DL task execution to meet the tasks' diverse real-time and accuracy requirements by exploiting the unique compute characteristics of DL tasks. RT-mDL employs a novel storage-bounded model scaling method to generate a series of model variants, and systematically optimizes DL task execution through joint model variant selection and task priority assignment. To improve the CPU/GPU utilization of mixed DL tasks, RT-mDL also includes a new priority-based scheduler that employs a GPU packing mechanism and executes CPU and GPU tasks independently. Our implementation on an F1/10 autonomous driving testbed shows that RT-mDL enables multiple concurrent DL tasks to achieve satisfactory real-time performance in traffic light detection and sign recognition. Moreover, compared to state-of-the-art baselines, RT-mDL reduces the deadline miss rate by 40.12% while sacrificing only 1.7% model accuracy.
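To make the idea of joint model variant selection and priority assignment concrete, the following is a minimal sketch under stated assumptions, not RT-mDL's actual algorithm. It assumes a single preemptive processor (the paper targets separate CPU and GPU resources), deadlines equal to periods, rate-monotonic priorities, and exhaustive search over variants. All names (`Variant`, `Task`, `response_times`, `select`) and all latency and accuracy numbers are hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from math import ceil


@dataclass(frozen=True)
class Variant:
    name: str
    exec_ms: int      # assumed worst-case execution time of this model variant
    accuracy: float   # assumed accuracy of this model variant


@dataclass(frozen=True)
class Task:
    name: str
    period_ms: int    # deadline is assumed equal to the period
    variants: tuple


def response_times(execs, periods):
    """Classic fixed-priority response-time analysis.

    Index 0 is the highest priority. Returns the worst-case response time
    of each task, or None if any task would miss its deadline.
    """
    result = []
    for i, (c, t) in enumerate(zip(execs, periods)):
        r = c
        while True:
            # Own execution time plus preemption by all higher-priority tasks.
            nxt = c + sum(ceil(r / periods[j]) * execs[j] for j in range(i))
            if nxt > t:
                return None
            if nxt == r:
                break
            r = nxt
        result.append(r)
    return result


def select(tasks):
    """Pick one variant per task, maximizing summed accuracy while schedulable."""
    order = sorted(tasks, key=lambda task: task.period_ms)  # rate-monotonic
    periods = [task.period_ms for task in order]
    best = None
    for combo in product(*(task.variants for task in order)):
        execs = [v.exec_ms for v in combo]
        if response_times(execs, periods) is None:
            continue  # this combination misses a deadline
        score = sum(v.accuracy for v in combo)
        if best is None or score > best[0]:
            best = (score, combo)
    if best is None:
        return None
    return [(task.name, v.name) for task, v in zip(order, best[1])]


# Hypothetical workload: two perception tasks with two variants each.
tasks = [
    Task("sign", 100, (Variant("full", 60, 0.97), Variant("small", 35, 0.94))),
    Task("light", 50, (Variant("full", 30, 0.95), Variant("small", 15, 0.90))),
]
print(select(tasks))  # [('light', 'full'), ('sign', 'small')]
```

In this toy workload, running both full-size models is unschedulable, so the search keeps the full model for the shorter-period task and falls back to the smaller variant for the other. Exhaustive search is only practical for a handful of tasks and variants; it is used here purely to keep the sketch short.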

Similar Papers

Accelerating Deep Learning Tasks with Optimized GPU-assisted Image Decoding
Lipeng Wang ... Shengen Yan
01 Dec 2020

Effectiveness of Moldable and Malleable Scheduling in Deep Learning Tasks
Ikki Fujiwara ... Kentaro Torisawa
01 Dec 2018

Cooperative Distributed GPU Power Capping for Deep Learning Clusters
Dong-Ki Kang ... Chan-Hyun Youn
IEEE Transactions on Industrial Electronics | VOL. 69
01 Jul 2022

Joint DNN Partition and Resource Allocation for Task Offloading in Edge–Cloud-Assisted IoT Environments
Wenhao Fan ... Fan Wu
IEEE Internet of Things Journal | VOL. 10
15 Jun 2023
