Abstract

Powerful deep neural networks (DNNs) have been propelling the development of efficient computer vision technologies for mobile systems such as phones and drones. To enable power-efficient image processing on resource-constrained devices, many studies have been dedicated to low-power DNNs at different layers of the system stack. Within this stack, task scheduling plays an essential role as the middleware between the algorithms and the underlying hardware. Especially now that heterogeneous SoCs have been widely adopted as the hardware solution in edge and mobile scenarios, an efficient DNN task scheduler is needed to reduce the implementation overhead of DNN-based tasks and to extract the full potential of the SoC platform. This chapter first exemplifies DNN scheduling with the image-recognition solution of LPIRC-2016 and introduces how to efficiently schedule a DNN-based visual processing task onto a typical heterogeneous SoC composed of general-purpose and specialized cores. After elaborating this task-level scheduling strategy, we discuss the fine-grained, DNN-wise scheduling policy on specialized DNN cores and show the effectiveness of memory-oriented DNN-layer scheduling. Last, since model quantization is an indispensable step in mapping a large neural network model onto resource-constrained mobile SoCs, we discuss the implications of DNN quantization on heterogeneous SoCs that integrate both integer and floating-point cores, and then introduce a scheduler-friendly DNN quantizer for pure-integer hardware. Although most prior work on low-power DNNs has focused on efficient network and hardware architectures, we show that scheduler-level optimization is also critical to the energy efficiency of the system, particularly when the algorithm implementation is fixed and off-the-shelf hardware is adopted.

Take-aways

- Demonstrates the rank-1 solution of LPIRC-2016 as a case study to introduce the basic coarse-grained scheduling techniques for DNN-based applications.
- Presents the memory-efficient, fine-grained neural network scheduler on DNN processors.
- Introduces the scheduler-friendly quantization technique to reduce the overhead of neural network implementation on embedded SoCs.

Minimal, illustrative code sketches of these three ideas follow below.
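As a rough illustration of the coarse-grained, task-level scheduling idea, the sketch below pipelines a vision task across heterogeneous cores using bounded queues and worker threads, so the general-purpose and specialized cores can work on different frames concurrently. The stage functions (preprocess_on_cpu, infer_on_dnn_core, postprocess_on_cpu), queue sizes, and frame count are hypothetical placeholders, not the chapter's actual implementation.

```python
# Minimal sketch (not the chapter's scheduler): software-pipelining a DNN
# vision task across heterogeneous cores. Stage names and queue sizes are
# illustrative assumptions.
import queue
import threading

def make_stage(fn, in_q, out_q):
    """Wrap a processing function as a pipeline stage running in its own thread."""
    def worker():
        while True:
            item = in_q.get()
            if item is None:            # sentinel: propagate shutdown downstream
                if out_q is not None:
                    out_q.put(None)
                break
            if out_q is not None:
                out_q.put(fn(item))
    return threading.Thread(target=worker, daemon=True)

# Hypothetical per-core work functions: decode/resize on the CPU, convolution
# layers offloaded to the specialized DNN core, post-processing on the CPU.
def preprocess_on_cpu(frame):  return frame   # e.g., resize + normalize
def infer_on_dnn_core(frame):  return frame   # e.g., offload to the NPU/GPU
def postprocess_on_cpu(pred):  return pred    # e.g., extract top-5 labels

q01, q12, q_out = queue.Queue(4), queue.Queue(4), queue.Queue(4)
stages = [
    make_stage(preprocess_on_cpu, q01, q12),
    make_stage(infer_on_dnn_core, q12, q_out),
]
for s in stages:
    s.start()

for frame_id in range(8):   # feed frames; bounded queues give backpressure
    q01.put(frame_id)
q01.put(None)               # shutdown sentinel

results = []
while (r := q_out.get()) is not None:
    results.append(postprocess_on_cpu(r))
print(results)
```

Bounded queues are the key design choice here: they throttle the fast stages so that no core races ahead and bloats memory while a slower core (typically the DNN accelerator) becomes the steady-state bottleneck.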

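To give a flavor of what memory-oriented DNN-layer scheduling optimizes, the following sketch greedily fuses consecutive layers whose intermediate feature maps fit in an assumed on-chip buffer, so that only group boundaries touch DRAM. The layer sizes, buffer capacity, and traffic model are illustrative assumptions rather than figures from the chapter.

```python
# Hypothetical illustration of memory-oriented layer scheduling: fuse
# consecutive layers whose intermediate feature maps fit on-chip, so those
# tensors never travel to DRAM. All numbers below are made up.
BUFFER_BYTES = 512 * 1024            # assumed on-chip SRAM capacity

# (layer_name, output_feature_map_bytes) for a toy network
layers = [("conv1", 800_000), ("conv2", 400_000), ("conv3", 200_000),
          ("conv4", 100_000), ("fc", 4_000)]

def fuse_greedy(layers, buf):
    """Group consecutive layers; feature maps passed inside a group must fit on-chip."""
    groups, cur = [], [layers[0]]
    for layer in layers[1:]:
        if cur[-1][1] <= buf:        # previous output stays in the buffer
            cur.append(layer)
        else:
            groups.append(cur)
            cur = [layer]
    groups.append(cur)
    return groups

def dram_traffic(groups):
    """Each group boundary costs a DRAM write plus a read-back; the final
    output is written once."""
    return sum(2 * g[-1][1] for g in groups[:-1]) + groups[-1][-1][1]

naive = dram_traffic([[l] for l in layers])              # layer-by-layer schedule
fused = dram_traffic(fuse_greedy(layers, BUFFER_BYTES))  # memory-oriented schedule
print(f"naive: {naive} B, fused: {fused} B")
```

Under these toy numbers the fused schedule roughly halves DRAM traffic, which is the kind of saving that matters on a DNN core whose energy budget is dominated by off-chip memory access.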
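Finally, as a hedged sketch of why quantization can be scheduler-friendly on pure-integer hardware, the example below quantizes float vectors with a power-of-two scale so that a dot product needs only integer multiply-accumulates plus a single final shift. The bit width, clamping scheme, and values are arbitrary assumptions; the chapter's actual quantizer may differ.

```python
# Minimal sketch of uniform, pure-integer quantization (an illustration, not
# the chapter's exact quantizer). A power-of-two scale makes requantization a
# plain bit shift, which a pure-integer core can execute without any
# floating-point support.
import math

def quantize_po2(values, bits=8):
    """Map float values to signed integers using a power-of-two scale."""
    qmax = 2 ** (bits - 1) - 1                     # e.g., 127 for int8
    max_abs = max(abs(v) for v in values) or 1.0
    shift = math.ceil(math.log2(max_abs / qmax))   # scale = 2**shift
    q = [max(-qmax - 1, min(qmax, round(v / 2.0 ** shift))) for v in values]
    return q, shift

# Integer-only use: a dot product of two quantized vectors needs nothing but
# integer MACs; the scale is applied once at the very end.
w, w_shift = quantize_po2([0.12, -0.5, 0.33])
x, x_shift = quantize_po2([1.0, 2.0, -0.25])
acc = sum(wi * xi for wi, xi in zip(w, x))         # integer multiply-accumulate
approx = acc * 2.0 ** (w_shift + x_shift)          # one final (shift) rescale
exact = 0.12 * 1.0 + (-0.5) * 2.0 + 0.33 * (-0.25)
print(approx, exact)
```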