RoadNet-RT: High Throughput CNN Architecture and SoC Design for Real-Time Road Segmentation

Lin Bai, Yecheng Lyu, Xinming Huang

doi:10.1109/tcsi.2020.3038139

Abstract

In recent years, convolutional neural networks (CNNs) have gained popularity in many engineering applications, especially computer vision. To achieve better performance, increasingly complex structures and advanced operations are incorporated into neural networks, which results in very long inference times. For time-critical tasks such as autonomous driving and virtual reality, real-time processing is fundamental. To reach real-time processing speed, this article proposes a lightweight, high-throughput CNN architecture, namely RoadNet-RT, for road segmentation. It achieves a 92.55% MaxF score on the KITTI road segmentation dataset. The inference time is about 9 ms per frame when running on a GTX 1080 GPU. Compared with the state-of-the-art network, RoadNet-RT speeds up inference by a factor of 17.8 at the cost of only a 3.75% loss in accuracy. In addition, its accuracy on the CamVid dataset is 92.98%. Several techniques, such as depthwise separable convolution and non-uniform kernel size convolution, are optimized in the hardware accelerator design. The proposed CNN architecture has been successfully implemented on a ZCU102 MPSoC FPGA, achieving a computation capability of 331 GOPS using INT8 quantization. The system throughput reaches 196.7 frames per second with an input image size of 280 × 960. The source code is published at https://github.com/linbaiwpi/RoadNet-RT.
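As a rough illustration of why depthwise separable convolution suits a lightweight, high-throughput design, the sketch below compares multiply-accumulate (MAC) counts for a standard convolution and its depthwise separable counterpart. The layer dimensions (32 input channels, 64 output channels, 3 × 3 kernel) and the function names are hypothetical and chosen only for illustration; they are not taken from the RoadNet-RT architecture. Only the 280 × 960 spatial size comes from the abstract, and a stride of 1 with "same" padding is assumed.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution (stride 1, 'same' padding assumed)."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Hypothetical layer: channel counts and kernel size are illustrative assumptions.
h, w = 280, 960            # input image size quoted in the abstract
c_in, c_out, k = 32, 64, 3

std = standard_conv_macs(h, w, c_in, c_out, k)
dws = depthwise_separable_macs(h, w, c_in, c_out, k)
ratio = dws / std          # analytically equal to 1/c_out + 1/k**2

print(f"standard conv:       {std:,} MACs")
print(f"depthwise separable: {dws:,} MACs")
print(f"cost ratio:          {ratio:.4f}")
```

For this assumed layer the separable form needs roughly one eighth of the MACs, since the cost ratio reduces to 1/c_out + 1/k². The actual savings in RoadNet-RT depend on its real layer configuration, which is not given in the abstract.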

Full Text