Abstract

With the emergence of increasingly powerful chipsets and hardware and the rise of the Artificial Intelligence of Things (AIoT), there is a growing trend toward deploying Deep Neural Network (DNN) models on mobile and edge devices so that they can support attractive AI applications in real time. To leverage heterogeneous computational resources (such as CPU, GPU, and DSP) to effectively and efficiently support the concurrent inference of multiple DNN models on a mobile or edge device, we propose a novel online Co-Scheduling framework based on deep REinforcement Learning, called COSREL. COSREL has the following desirable features: 1) it achieves significant speedup over commonly used methods by efficiently utilizing all the computational resources on heterogeneous hardware; 2) it leverages Deep Reinforcement Learning (DRL) to make dynamic online scheduling decisions based on the system runtime state; 3) it is capable of making a good tradeoff among inference latency, throughput, and energy efficiency; and 4) it makes no changes to the given DNN models and thus preserves their accuracy. To evaluate COSREL, we conduct extensive experiments on an off-the-shelf Android smartphone. The experimental results show that COSREL consistently outperforms the baselines in terms of throughput, latency, and energy efficiency.
