Abstract

RGB cameras and LiDAR are crucial sensors for autonomous vehicles, providing complementary information for accurate detection. Recent early-fusion approaches, which enrich LiDAR data with camera features, may not achieve promising performance because of the large gap between the two modalities. This paper presents a simple and effective vehicle detection method based on an early-fusion strategy, unified 2D BEV grids, and feature fusion. The proposed method first eliminates many null points through cross-calibration between the camera and LiDAR. It then augments the point cloud with color information to generate a 7D colored point cloud and projects the augmented data onto unified 2D BEV grids. The colored BEV maps can be fed to any 2D convolutional network. A dedicated Feature Fusion (2F) detection module is used to extract multi-scale features from the BEV images. Experiments on the KITTI benchmark and the nuScenes dataset show that fusing RGB images with the point cloud, rather than using the raw point cloud alone, leads to better detection accuracy. Moreover, the inference time of the proposed method reaches 0.05 s per frame thanks to its simple and compact architecture.
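The pipeline sketched in the abstract, painting each LiDAR point with the RGB value it projects onto and then rasterizing the colored points into a BEV grid, can be illustrated roughly as follows. This is a minimal sketch and not the authors' implementation: the composed projection matrix `proj`, the grid ranges, and the choice of BEV channels (RGB, maximum height, point density) are assumptions made here for illustration.

```python
import numpy as np

def color_points(points, image, proj):
    """Attach RGB to LiDAR points, giving 7D points (x, y, z, intensity, r, g, b).

    points : (N, 4) array of x, y, z, intensity in the LiDAR frame.
    image  : (H, W, 3) uint8 RGB image.
    proj   : assumed 3x4 matrix mapping homogeneous LiDAR coordinates to pixel
             coordinates (e.g. a composed camera-LiDAR calibration).
    Points projecting outside the image (including "null" points with no valid
    color) are dropped.
    """
    xyz1 = np.hstack([points[:, :3], np.ones((points.shape[0], 1))])  # homogeneous coords
    img = proj @ xyz1.T                                               # 3xN unnormalized pixels
    uv = img[:2] / img[2]                                             # normalize by depth
    h, w = image.shape[:2]
    keep = (img[2] > 0) & (uv[0] >= 0) & (uv[0] < w) & (uv[1] >= 0) & (uv[1] < h)
    u, v = uv[0, keep].astype(int), uv[1, keep].astype(int)
    rgb = image[v, u].astype(np.float32) / 255.0                      # sample color per point
    return np.hstack([points[keep], rgb])                             # (N_keep, 7)

def colored_bev(points7d, x_range=(0.0, 70.0), y_range=(-35.0, 35.0), res=0.1):
    """Rasterize 7D colored points into a 2D BEV grid with 5 channels:
    r, g, b, max height, point density (channel layout is an assumption)."""
    xs = ((points7d[:, 0] - x_range[0]) / res).astype(int)
    ys = ((points7d[:, 1] - y_range[0]) / res).astype(int)
    H = int((x_range[1] - x_range[0]) / res)
    W = int((y_range[1] - y_range[0]) / res)
    mask = (xs >= 0) & (xs < H) & (ys >= 0) & (ys < W)
    xs, ys, pts = xs[mask], ys[mask], points7d[mask]
    bev = np.zeros((H, W, 5), dtype=np.float32)
    bev[xs, ys, :3] = pts[:, 4:7]                     # color of a point falling in the cell
    np.maximum.at(bev[..., 3], (xs, ys), pts[:, 2])   # max height per cell
    np.add.at(bev[..., 4], (xs, ys), 1.0)             # point count per cell
    return bev
```

The resulting BEV map is an ordinary multi-channel image, which is why, as the abstract notes, it can be passed to any 2D convolutional backbone without modality-specific operators.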
