VSL-Net: Voxel structure learning for 3D object detection

Zufeng Zhang, Chongben Tao, Zhen Gao, Feng Zhou, Jun Xue, Yuan Zhu, Feng Cao

doi:10.1016/j.aei.2023.102348

Abstract

Current single-stage detection methods generally lack contextual structure information, and their classification and localization confidences are inconsistent, so they cannot achieve accurate dynamic multi-object detection. To address this, VSL-Net is proposed on a single-stage detection framework; it maps voxels to the point cloud space through voxel structure learning for foreground point segmentation and center estimation. In the voxel feature extraction network, submanifold sparse convolution is introduced to improve convolution efficiency by reducing computation over empty regions. In the detection network, a position alignment module is proposed that samples and interpolates in the Bird’s Eye View (BEV), integrating the features of surrounding points to further improve detection accuracy. VSL-Net was compared with other leading methods on the KITTI, Waymo and nuScenes datasets. The vehicle detection accuracy of VSL-Net reached 92.12%. Ablation and detection accuracy experiments were also completed on a real physical platform. The experimental results show that the proposed method is portable, generalizes well and achieves high detection accuracy.
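
As an illustration of why submanifold sparse convolution avoids work on empty regions, the following is a minimal 2D sketch in plain Python. It is not the VSL-Net implementation: the function name `submanifold_conv2d`, the dictionary-based sparse layout, and the scalar features are all assumptions made for this example. The key property it shows is that outputs are computed only at sites that are already active in the input, so the set of active sites does not grow from layer to layer.

```python
from itertools import product


def submanifold_conv2d(active, weights, kernel_size=3):
    """Toy submanifold sparse convolution on a 2D grid (illustrative only).

    active  : dict mapping (row, col) -> float feature of a non-empty cell
    weights : dict mapping (d_row, d_col) -> float kernel weight
    Returns a dict with exactly the same keys as `active`.
    """
    radius = kernel_size // 2
    offsets = list(product(range(-radius, radius + 1), repeat=2))
    out = {}
    # Iterate over active sites only; empty cells are never visited.
    for (row, col) in active:
        acc = 0.0
        for d_row, d_col in offsets:
            neighbor = active.get((row + d_row, col + d_col))
            if neighbor is not None:  # skip empty neighbors
                acc += weights[(d_row, d_col)] * neighbor
        out[(row, col)] = acc
    return out


# Hypothetical input: three occupied cells on an otherwise empty grid.
active = {(0, 0): 1.0, (0, 1): 2.0, (5, 5): 3.0}
weights = {offset: 1.0 for offset in product(range(-1, 2), repeat=2)}
result = submanifold_conv2d(active, weights)
```

With an all-ones kernel, the two adjacent cells each sum to 3.0 and the isolated cell keeps its own value, while no output is produced for any empty cell. The cost therefore scales with the number of occupied cells rather than with the grid size.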

Full Text