Abstract
As the need for on-device machine learning increases, embedded devices tend to be equipped with heterogeneous processors that include a multi-core CPU, a GPU, and/or a DNN accelerator called a Neural Processing Unit (NPU). Scheduling multiple deep learning (DL) applications on such embedded devices poses several technical challenges. First, a task can be mapped onto a single core or any number of available cores, so various possible configurations of CPU cores must be considered. Second, embedded devices usually apply Dynamic Voltage and Frequency Scaling (DVFS) to reduce energy consumption at run time, so the effect of DVFS must be considered when profiling task execution times. Third, to avoid overheating, it is recommended to limit core utilization. Lastly, if the hot-plugging option is turned on, some cores will be shut down at run time when their utilization is not high enough. In this paper, we propose a scheduling technique based on a Genetic Algorithm to run DL applications on heterogeneous processors, considering all of these issues. First, we aim to optimize the throughput of a single DL application. Next, we aim to find Pareto-optimal schedules of multiple DL applications in terms of the response time of each application and overall energy consumption, under the given throughput constraints of the DL applications. The proposed technique is verified with real DL networks running on two embedded devices, the Galaxy S9 and the HiKey970.
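To make the Genetic Algorithm formulation concrete, the following is a minimal, illustrative sketch of GA-based task-to-PE mapping. All names, PE lists, and execution times here are hypothetical placeholders (not from the paper), and the fitness is a simple makespan over per-PE load; the actual technique additionally accounts for DVFS, utilization limits, and hot-plugging.

```python
import random

# Hypothetical PE set: four CPU cores, one GPU, one NPU.
PES = ["cpu0", "cpu1", "cpu2", "cpu3", "gpu", "npu"]
# Hypothetical profiled execution times (ms): EXEC_TIME[layer][pe].
EXEC_TIME = [
    [9, 9, 9, 9, 4, 2],   # layer 0
    [7, 7, 7, 7, 3, 1],   # layer 1
    [8, 8, 8, 8, 5, 2],   # layer 2
]

def makespan(chromosome):
    # A chromosome assigns each layer a PE index; the busiest PE
    # bounds the throughput, so lower makespan is better.
    load = [0.0] * len(PES)
    for layer, pe in enumerate(chromosome):
        load[pe] += EXEC_TIME[layer][pe]
    return max(load)

def evolve(pop_size=20, generations=50, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(len(PES)) for _ in EXEC_TIME]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=makespan)
        survivors = pop[: pop_size // 2]           # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(a))         # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:                 # mutation: remap one layer
                child[rng.randrange(len(child))] = rng.randrange(len(PES))
            children.append(child)
        pop = survivors + children
    return min(pop, key=makespan)

best = evolve()
print(best, makespan(best))
```

With elitism, the best makespan never degrades across generations; in practice the fitness function would be replaced by a schedule simulator that models the constraints above.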
Highlights
As deep learning (DL) is making significant progress in almost all areas of machine learning, more applications based on DL will be seen in our daily life
It is necessary to schedule multiple DL applications on the shared heterogeneous processing elements, which is a challenging problem tackled in this paper
A DL application is usually developed with DL frameworks and libraries such as TensorFlow [1], Caffe2 [2], PyTorch [2], ARM Compute Library (ACL) [3], and so on
Summary
As deep learning (DL) is making significant progress in almost all areas of machine learning, more applications based on DL will be seen in our daily life. We propose a scheduling framework that maps (sub-)layers of a CNN onto heterogeneous PEs such as the CPU, GPU, and NPU, taking into account several practical issues in embedded devices. This is the first work to schedule multiple DL applications on heterogeneous PEs with processor sharing, considering both inter-layer and intra-layer parallelism. RSTensorflow [7] is a framework that lets users exploit heterogeneous processors such as the CPU and GPU. It uses RenderScript [8] to manually parallelize the computation workloads in a layer, such as matrix multiplication and convolution, across CPU cores and the GPU. Mirhoseini et al. [9] proposed a hierarchical DNN model for the efficient placement of a neural network graph onto hardware devices.
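The multi-application objective above (per-application response time vs. overall energy) amounts to keeping only non-dominated schedules. The sketch below shows the standard Pareto-dominance filter with hypothetical (response time, energy) pairs; the values are illustrative only, not measurements from the paper.

```python
def pareto_front(points):
    # Keep points not dominated by any other: p = (time, energy) is
    # dominated if some q is no worse in both objectives and differs
    # from p (both objectives are minimized).
    front = []
    for p in points:
        if not any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points):
            front.append(p)
    return front

# Hypothetical (response time ms, energy mJ) pairs for candidate schedules.
candidates = [(10, 50), (12, 40), (15, 35), (11, 55), (15, 36)]
print(pareto_front(candidates))  # → [(10, 50), (12, 40), (15, 35)]
```

Here (11, 55) is dropped because (10, 50) is better in both objectives, and (15, 36) is dropped in favor of (15, 35); the remaining points are the trade-off curve presented to the user.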