Abstract

Monocular 3D object detection aims to localize objects in 3D space from a single image. This is a difficult problem due to the lack of accurate depth measurements. Many methods predict depth upfront with a pre-trained vision-based depth estimator to assist 3D object detection. However, these methods achieve only limited improvements because of inaccurate depth estimates and their neglect of depth confidence. In this paper, we propose a new end-to-end 3D object detection framework that combines a monocular camera with a cheap 4-beam LiDAR. The minimal LiDAR signal is leveraged as an additional input to predict high-quality, dense depth maps from monocular images. Meanwhile, 3D proposals are generated by a keypoint-based detector. The key challenge is encoding depth confidence to capture the uncertainty of the depth estimates. We therefore propose probabilistic instance shape reconstruction, which exploits instance shape information for box refinement. Our method is a fully differentiable end-to-end framework, making it simple and efficient. Experimental results on the KITTI dataset demonstrate that the proposed method achieves state-of-the-art performance, validating the effectiveness of the sparse LiDAR data and the probabilistic instance shape reconstruction. Code is available at https://github.com/xjtuwh/SparseLiDAR_fusion.
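To make the "cheap 4-beam LiDAR" input concrete: a common way to study such a sensor on KITTI is to sparsify a dense depth map down to a handful of scan rows. The sketch below is not the authors' code; the function name `sparsify_depth` and the even-row selection are illustrative assumptions about how a 4-beam signal might be simulated.

```python
import numpy as np

def sparsify_depth(dense_depth: np.ndarray, num_beams: int = 4) -> np.ndarray:
    """Simulate a sparse few-beam LiDAR signal (illustrative only):
    keep `num_beams` evenly spaced rows of the depth map, zero the rest."""
    sparse = np.zeros_like(dense_depth)
    rows = np.linspace(0, dense_depth.shape[0] - 1, num_beams).astype(int)
    sparse[rows] = dense_depth[rows]
    return sparse

# Toy usage: an 8x4 "depth map" with constant depth 10 m
dense = np.full((8, 4), 10.0)
sparse = sparsify_depth(dense, num_beams=4)
```

A sparse map like this, concatenated with the RGB image, would serve as the extra input channel from which the network completes a dense depth estimate.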
