Deep structural information fusion for 3D object detection on LiDAR–camera system

Pei An, Junxiong Liang, Kun Yu, Bin Fang, Jie Ma

doi:10.1016/j.cviu.2021.103295

Abstract

3D object detection on a LiDAR–camera system is a challenging task because 3D LiDAR points and 2D RGB images have different data representations. In this paper, we consider that geometrical consistency between local 3D and 2D regions is helpful for the regression task in 3D object detection, and propose a 3D–2D consistent feature. It is based on hand-crafted 3D and 2D descriptors, generates a primary structure feature, and has stable performance in outdoor scenes. Considering that material features can be used to distinguish different objects, we propose the material coefficients ratio (MCR), based on the Lambertian model, to generate a primary semantic feature that benefits the classification task in 3D object detection. To take advantage of both the 3D–2D consistent feature and MCR, we propose deep 3D–2D structural information fusion (SIF) for 3D object detection. It provides an attentional structural voxel feature, which is used as the input of LiDAR voxel-based 3D object detectors. SIF is a lightweight, effective, and explainable module. Extensive experiments on an outdoor 3D object detection dataset demonstrate that SIF improves the performance of both single-stage and multi-stage LiDAR voxel-based 3D detectors.

Full Text