WMDRS: Workload-Aware Performance Model Based Multi-Task Dynamic-Quota Real-Time Scheduling for Neural Processing Units

Chong Liu, Yuan Yao, Wei Jia, Xinyu Tian, Xingshe Zhou, Gang Yang, Yi Dang

doi:10.1109/icpads56603.2022.00063

Abstract

To improve the capacity of airborne embedded systems to handle deep learning (DL) applications while reducing overall power consumption, they need to be equipped with Neural Processing Units (NPUs). Compared with cloud systems, an airborne embedded system usually has a fixed application set but strict real-time constraints. Unfortunately, the built-in NPU scheduler does not consider application priority, so it cannot provide sufficient real-time capability for airborne embedded systems. At present, there is little research on multi-task real-time scheduling for NPUs. We therefore propose WMDRS, a workload-aware performance model based multi-task dynamic-quota real-time scheduling method for NPUs. The workload-aware NPU performance model can accurately predict the remaining execution time of a task that is running concurrently with other tasks on the NPU. The multi-task dynamic-quota real-time scheduling algorithm provides approximate preemption by dynamically adjusting the NPU computing resources allotted to active applications. In addition, we implement a prototype NPU scheduler that requires no hardware extension. The proposed NPU performance model and real-time scheduling algorithm are evaluated on realistic application sets. Experimental results demonstrate that WMDRS achieves low prediction error and a high scheduling success ratio.

Similar Papers

Energy-Aware Scenario-Based Mapping of Deep Learning Applications Onto Heterogeneous Processors Under Real-Time Constraints
Jangryul Kim ... Soonhoi Ha
IEEE Transactions on Computers | VOL. 72
01 Jun 2023

Scheduling of Deep Learning Applications Onto Heterogeneous Processors in an Embedded Device
Duseok Kang ... Jongwoo Choi
IEEE Access | VOL. 8
01 Jan 2020

Deep Learning Inference Parallelization on Heterogeneous Processors With TensorRT
Eunjin Jeong ... Jaeseong Lee
IEEE Embedded Systems Letters | VOL. 14
01 Mar 2022

Prototyping of Low-Cost Configurable Sparse Neural Processing Unit with Buffer and Mixed-Precision Reshapeable MAC Array
Binyi Wu ... Bernd Waschneck
01 Jan 2023
