Abstract

This paper surveys how Convolutional Neural Networks (CNNs) can be used to localize and categorize objects in images or videos on mobile heterogeneous SoCs. Recently, a variety of CNN-based object detection frameworks have demonstrated increasing accuracy and speed. Although they are making rapid progress in high-quality image recognition, state-of-the-art CNN-based detection frameworks seldom discuss their hardware-dependent aspects or the cost-effectiveness of real-time image analysis on off-the-shelf low-power devices. As the focus of deep learning and convolutional neural networks shifts toward embedded and mobile applications with limited power and computational resources, scaling down object detection frameworks and CNNs is becoming a new and important direction. In this work we conduct a comprehensive comparative study of state-of-the-art real-time object detection frameworks, evaluating their performance and cost-effectiveness/energy-efficiency (measured in mAP/Wh) on off-the-shelf mobile GPU devices. Based on the analysis results and observations from this investigation, we propose to adjust the design parameters of such frameworks and employ a design space exploration procedure to maximize the energy efficiency (mAP/Wh) of real-time object detection solutions on mobile GPUs. As shown in the benchmarking results, we successfully boost the energy efficiency of several popular CNN-based detection solutions by maximizing the utilization of the SoC's computation resources and trading off prediction accuracy against energy cost. In the second Low-Power Image Recognition Challenge (LPIRC), our system achieved the best result measured in mAP/Energy on the embedded Jetson TX1 CPU+GPU SoC.
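
The following is an illustrative sketch, not the paper's actual procedure: it shows how the energy-efficiency metric mAP/Wh can be computed and used to rank candidate configurations in a simple exhaustive design space exploration. The design parameters (input resolution, batch size) and the evaluate() function that measures a detector's accuracy and energy on the target SoC are hypothetical placeholders.

    # Sketch of design space exploration maximizing mAP/Wh (assumptions noted above).
    from itertools import product

    def energy_efficiency(mAP, energy_wh):
        """Energy efficiency as mAP per watt-hour (mAP/Wh)."""
        return mAP / energy_wh

    def explore_design_space(evaluate):
        """Exhaustively search hypothetical design parameters and keep the
        configuration with the highest mAP/Wh.  `evaluate(resolution, batch)`
        is assumed to run the detector on the target device and return a
        (mAP, energy_in_Wh) pair."""
        best_cfg, best_eff = None, float("-inf")
        for resolution, batch in product([320, 416, 608], [1, 2, 4]):
            mAP, energy_wh = evaluate(resolution, batch)
            eff = energy_efficiency(mAP, energy_wh)
            if eff > best_eff:
                best_cfg, best_eff = (resolution, batch), eff
        return best_cfg, best_eff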
