Voxel-RCNN-Complex: An Effective 3-D Point Cloud Object Detector for Complex Traffic Conditions

Hai Wang, Yingfeng Cai, Yicheng Li, Long Chen, Zhixiong Li, Zhiyu Chen, Miguel Angel Sotelo

doi:10.1109/tim.2022.3165251

Abstract

Complex traffic conditions and high traffic flow pose major challenges to the perception systems of autonomous vehicles. As the basis of environmental perception, object detection on point clouds is essential to the normal operation of autonomous vehicles. To address complex traffic conditions, in this work we use the One millioN sCenEs (ONCE) dataset to train an effective 3-D object detector, Voxel-RCNN-Complex, where RCNN denotes a region-based convolutional neural network. We obtain it by modifying Voxel RCNN to suit complex traffic conditions. We add residual structures to the 3-D backbone and design a heavy 3-D feature extractor, which helps extract high-dimensional information. We also design a 2-D backbone composed of residual structures, self-calibration convolution, and spatial and channel attention mechanisms; this expands the receptive field and captures more context information. Compared with Voxel RCNN, the proposed Voxel-RCNN-Complex significantly improves detection performance for long-distance and small objects. To further increase the robustness of the model and alleviate category imbalance, we use a class-balanced sampling strategy (CBSS). We evaluate the proposed model on the ONCE dataset, where it achieves an mAP of 65.34% and an inference speed of 13.8 FPS and performs better than other methods, which demonstrates its effectiveness. Moreover, we also test the proposed model on an intelligent vehicle platform on real roads.
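
The abstract does not specify how the class-balanced sampling strategy is implemented. As a rough illustration of the general idea only, the following minimal sketch assumes a simple inverse-frequency scheme: each scene receives a sampling weight that grows with the rarity of the classes it contains. The function names, the weighting rule, and the toy scene labels are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def class_balanced_weights(scene_labels):
    """Return one sampling weight per scene, favouring scenes with rare classes.

    Assumption (not from the paper): a scene's weight is the sum of the
    inverse scene-frequencies of the distinct classes it contains.
    """
    # Count in how many scenes each class appears at least once.
    counts = Counter(c for labels in scene_labels for c in set(labels))
    return [sum(1.0 / counts[c] for c in set(labels)) for labels in scene_labels]


def resample(scene_labels, k, seed=0):
    """Draw k scene indices with replacement using the weights above."""
    rng = random.Random(seed)
    weights = class_balanced_weights(scene_labels)
    return rng.choices(range(len(scene_labels)), weights=weights, k=k)


# Hypothetical toy data: the object classes annotated in each scene.
scenes = [["Car"], ["Car"], ["Car"], ["Car", "Pedestrian"], ["Cyclist"]]
weights = class_balanced_weights(scenes)
indices = resample(scenes, k=10)
```

In this toy set, the scenes containing a pedestrian or a cyclist receive larger weights than the car-only scenes, so they are drawn more often during training. A scene with no labels would get weight zero under this rule, so a practical version would need a floor value.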

Full Text