A Co-Scheduling Framework for DNN Models on Mobile and Edge Devices with Heterogeneous Hardware

Zhiyuan Xu, Yanzhi Wang, Jian Tang, Guoliang Xue, Chengxiang Yin, Dejun Yang

doi:10.1109/tmc.2021.3107424

Abstract

With the emergence of increasingly powerful chipsets and hardware and the rise of Artificial Intelligence of Things (AIoT), there is a growing trend toward deploying Deep Neural Network (DNN) models on mobile and edge devices so that they can support attractive AI applications in real time. To leverage heterogeneous computational resources (such as CPU, GPU, and DSP) to effectively and efficiently support the concurrent inference of multiple DNN models on a mobile or edge device, we propose a novel online Co-Scheduling framework based on deep REinforcement Learning, called COSREL. COSREL has the following desirable features: 1) it achieves significant speedup over commonly used methods by efficiently utilizing all the computational resources on heterogeneous hardware; 2) it leverages emerging Deep Reinforcement Learning (DRL) to make dynamic and wise online scheduling decisions based on the system's runtime state; 3) it is capable of making a good tradeoff among inference latency, throughput, and energy efficiency; and 4) it makes no changes to the given DNN models and thus preserves their accuracy. To evaluate COSREL, we conduct extensive experiments on an off-the-shelf Android smartphone. The experimental results show that COSREL consistently outperforms the baselines in terms of throughput, latency, and energy efficiency.

Full Text