Abstract

Neural networks are increasingly applied in real-time and embedded Artificial Intelligence (AI) systems such as autonomous driving. Such resource-constrained systems cannot support the execution of neural-network-based tasks on general-purpose processors due to their high execution overheads. Hence, we design real-time AI applications for embedded systems with a CPU and FPGA (Field Programmable Gate Array) coprocessors. We use the dedicated FPGA to accelerate the neural network jobs and the CPU to process the remaining jobs of real-time multitasking applications. We devise an Idle-Aware Earliest Deadline First policy to co-schedule the AI applications on the hybrid CPU and FPGA coprocessors. Since implementing a neural network job on the FPGA accelerator with different precision configurations results in different execution times and accuracies, we also study the design optimization of real-time AI applications running on a mixed-precision neural network accelerator, with the goal of maximizing the accuracy-related rewards of all applications subject to real-time constraints. We formulate this problem as a multi-stage decision procedure and propose an efficient dynamic programming approach with two pruning policies that reduce the number of intermediate search states. Extensive experiments and real-life case evaluations demonstrate the efficiency of the proposed approaches.
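
To illustrate the flavor of the multi-stage decision formulation, the following is a minimal sketch, not the paper's implementation. It assumes each application offers a hypothetical set of (execution time, reward) precision options and uses a single total time budget as a stand-in for the real-time constraints; one pruning step drops infeasible states and the other keeps only non-dominated (time, reward) states.

```python
from typing import List, Tuple

# Hypothetical precision options per application: (execution_time, reward).
# The option values and the single time budget are illustrative stand-ins
# for the real-time constraints in the paper.
Option = Tuple[float, float]

def select_precisions(apps: List[List[Option]], time_budget: float) -> float:
    """Maximize total reward, choosing one precision option per application,
    subject to the summed execution time fitting within time_budget."""
    # Each DP state is (total_time, total_reward) after deciding some applications.
    states: List[Tuple[float, float]] = [(0.0, 0.0)]
    for options in apps:                      # one decision stage per application
        expanded = [
            (t + et, r + rw)
            for (t, r) in states
            for (et, rw) in options
            if t + et <= time_budget          # pruning 1: drop infeasible states
        ]
        # Pruning 2: keep only non-dominated (time, reward) pairs.
        expanded.sort(key=lambda s: (s[0], -s[1]))
        states = []
        best_reward = float("-inf")
        for t, r in expanded:
            if r > best_reward:               # strictly better reward than any faster state
                states.append((t, r))
                best_reward = r
    return max((r for _, r in states), default=0.0)

if __name__ == "__main__":
    # Two applications, each with a low-precision (fast, low reward)
    # and a high-precision (slow, high reward) configuration.
    apps = [[(2.0, 0.90), (5.0, 0.98)], [(3.0, 0.85), (6.0, 0.95)]]
    print(select_precisions(apps, time_budget=9.0))  # -> 1.85
```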
