Abstract

3D vehicle detectors based on point clouds generally achieve higher detection performance than multi-sensor detectors. However, lacking texture information, point-based methods miss many occluded and distant vehicles and produce high-confidence false detections of similarly shaped objects, which poses a potential threat to traffic safety. Therefore, in the long run, fusion-based methods have more potential. This paper presents a multi-level fusion network for 3D vehicle detection from point clouds and images. The fusion network comprises three stages: data-level fusion of point clouds and images, feature-level fusion of voxel and Bird's Eye View (BEV) features in the point cloud branch, and feature-level fusion of point cloud and image features. In addition, a novel coarse-fine detection header is proposed that mimics two-stage detectors, generating coarse proposals on the encoder and refining them on the decoder. Extensive experiments show that the proposed network detects occluded and distant vehicles more reliably and reduces false detections of similarly shaped objects, outperforming several state-of-the-art detectors on the challenging KITTI benchmark. Ablation studies further demonstrate the effectiveness of each designed module.
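The three fusion stages and the coarse-fine header described above can be sketched at a purely schematic level. The sketch below is an assumption for illustration only: the paper's actual fusion operators, feature shapes, and header layers are not specified in this abstract, so simple channel-wise concatenation and linear projections stand in for them.

```python
import numpy as np

def data_level_fusion(points, rgb):
    """Data-level fusion (assumed form): append per-point image color
    to raw point features (e.g. XYZ + intensity), point-painting style."""
    return np.concatenate([points, rgb], axis=1)

def feature_level_fusion(feat_a, feat_b):
    """Feature-level fusion (assumed form): concatenate two spatially
    aligned feature maps (e.g. voxel and BEV, or point-cloud and image
    branches) along the channel axis."""
    return np.concatenate([feat_a, feat_b], axis=0)

def coarse_fine_header(encoder_feat, decoder_feat, w_coarse, w_fine):
    """Coarse-fine header (assumed form): coarse proposals regressed
    from encoder features, then refined by a residual predicted from
    decoder features, imitating a two-stage detector."""
    coarse = encoder_feat @ w_coarse          # coarse proposals on the encoder
    fine = decoder_feat @ w_fine + coarse     # refinement on the decoder
    return coarse, fine

# Toy shapes: 5 points with XYZ+intensity fused with sampled RGB.
fused_points = data_level_fusion(np.zeros((5, 4)), np.ones((5, 3)))
print(fused_points.shape)  # (5, 7)

# 8-channel voxel map fused with 4-channel BEV map on a 10x10 grid.
fused_feat = feature_level_fusion(np.zeros((8, 10, 10)), np.zeros((4, 10, 10)))
print(fused_feat.shape)  # (12, 10, 10)
```

The concatenation-then-project pattern is only one common fusion choice; attention-based or gated fusion would slot into the same interfaces.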
