Abstract

Deep learning vision applications are increasingly deployed on intelligent terminal devices. However, as deep neural networks (DNNs) grow in depth and structural complexity, their accuracy on computer vision tasks improves, but inference on resource-constrained terminal devices often fails to meet latency requirements. A common acceleration strategy is the multi-exit DNN, which reduces latency by allowing inference to terminate at early exits. Existing methods, however, do not fully exploit the heterogeneous processors (GPU/CPU) on intelligent terminal devices to accelerate multi-exit DNN inference cooperatively and in parallel. Moreover, the impact of complex image and video inputs on multi-exit DNNs, and the effect of different processor power consumption modes, remain inadequately explored. To address these issues, we comprehensively consider the computing performance of heterogeneous processors under different power consumption modes and the structure and characteristics of multi-exit DNNs, and propose the Collaborative Inference Acceleration mechanism for intelligent terminals with Heterogeneous Processors (CIAHP). CIAHP comprises a DNN computation time prediction model and a multi-exit DNN task allocation algorithm for heterogeneous processors. Our experiments show that CIAHP performs multi-exit DNN inference 2.31× faster than the CPU alone and 1.23× faster than the GPU alone when processing complex image samples.
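The abstract does not detail how an early exit is taken in CIAHP. As a point of reference only, the sketch below illustrates the general multi-exit idea in PyTorch, assuming a BranchyNet-style confidence-threshold exit policy; the class name TwoExitCNN, the layer sizes, and the threshold value are illustrative assumptions, not the paper's architecture.

```python
# Minimal multi-exit CNN sketch (PyTorch). Assumes a confidence-threshold
# early-exit rule, a common policy in multi-exit DNNs; CIAHP's actual
# backbone, exits, and exit criterion are not specified in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoExitCNN(nn.Module):
    def __init__(self, num_classes: int = 10, threshold: float = 0.9):
        super().__init__()
        self.threshold = threshold  # softmax confidence needed to exit early
        # Shallow stage shared by both exits.
        self.stage1 = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)
        )
        # Early-exit branch: a cheap classifier on shallow features.
        self.exit1 = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes)
        )
        # Deeper stage, traversed only when the early exit is not confident.
        self.stage2 = nn.Sequential(
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)
        )
        self.exit2 = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes)
        )

    @torch.no_grad()
    def forward(self, x: torch.Tensor):
        feat = self.stage1(x)
        logits1 = self.exit1(feat)
        conf = F.softmax(logits1, dim=1).max(dim=1).values
        # Simplification: the whole batch exits only if every sample is
        # confident; per-sample routing is more common in practice.
        if bool((conf >= self.threshold).all()):
            return logits1, 1
        return self.exit2(self.stage2(feat)), 2


model = TwoExitCNN().eval()
logits, exit_taken = model(torch.randn(1, 3, 32, 32))
print(f"exited at branch {exit_taken}")
```

Under this kind of policy, "easy" inputs terminate at the cheap early exit while complex inputs fail the confidence check and pay for the deeper stages, which is consistent with the abstract's observation that complex image samples stress multi-exit DNNs and motivate cooperative CPU/GPU allocation.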
