Abstract

This paper proposes an end-to-end deep learning framework, termed the motion-aid feature calibration network (MFCN), for video object detection. The key idea is to leverage the temporal coherence of video features while accounting for their motion patterns as captured by optical flow. To boost detection accuracy, the framework aggregates calibrated features across frames at both the pixel and instance levels, improving robustness to appearance variations. Calibration and aggregation are performed efficiently and adaptively through an integrated optical flow network. Because the entire architecture is trainable end-to-end, training and inference are significantly more efficient than in multi-stage video object detection methods. Evaluations on KITTI and ImageNet VID show that MFCN improves the results of a strong still-image detector by 11.2% and 7.31%, respectively. MFCN also outperforms other competitive video object detectors and achieves a better trade-off between accuracy and runtime speed, demonstrating its potential for use in autonomous driving systems.
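
At a high level, the pixel-level calibration described above can be read as warping neighbor-frame features toward the reference frame along the estimated optical flow, then fusing the warped features with adaptive per-location weights. The sketch below illustrates only this generic flow-guided warp-and-aggregate idea, not the paper's exact design; the function names, the cosine-similarity weighting, and the PyTorch implementation are assumptions.

```python
import torch
import torch.nn.functional as F

def warp_features(feat, flow):
    """Warp neighbor-frame features toward the reference frame.

    feat: (N, C, H, W) feature map of a neighboring frame.
    flow: (N, 2, H, W) flow from the reference frame to that
          neighbor, in pixels (x displacement first, then y).
    """
    n, _, h, w = feat.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=feat.dtype, device=feat.device),
        torch.arange(w, dtype=feat.dtype, device=feat.device),
        indexing="ij",
    )
    # Displace the reference sampling grid by the flow, then
    # normalize coordinates to [-1, 1] as grid_sample expects.
    gx = 2.0 * (xs + flow[:, 0]) / (w - 1) - 1.0   # (N, H, W)
    gy = 2.0 * (ys + flow[:, 1]) / (h - 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)            # (N, H, W, 2)
    return F.grid_sample(feat, grid, align_corners=True)

def aggregate(ref_feat, neighbor_feats, flows):
    """Fuse reference features with flow-calibrated neighbor features,
    weighting each spatial location by its similarity to the reference
    frame (one common adaptive-weighting choice; assumed here)."""
    warped = [warp_features(f, fl) for f, fl in zip(neighbor_feats, flows)]
    feats = torch.stack([ref_feat] + warped)        # (T, N, C, H, W)
    sim = F.cosine_similarity(feats, ref_feat.unsqueeze(0), dim=2)
    w = torch.softmax(sim, dim=0).unsqueeze(2)      # (T, N, 1, H, W)
    return (w * feats).sum(dim=0)                   # fused (N, C, H, W)
```

A detection head would then operate on the aggregated feature map; an analogous scheme at the instance level would warp and fuse per-proposal (RoI) features rather than whole feature maps.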
