3D object detection from point clouds has developed rapidly. However, the distribution of points in real scenes is unbalanced, so distant or occluded objects are covered by too few points to be perceived reliably, which degrades overall detection accuracy. We therefore propose a novel two-stage 3D object detection framework, the Shape-Enhancement and Geometry-Aware Network (SEGANet), which mitigates the negative impact of unbalanced point distribution to boost detection performance. In stage 1, we capture fine-grained structural knowledge by enriching voxel features with point-wise features to generate proposals. In stage 2, a shape enhancement module reconstructs complete surface points for objects within the proposals, and an elaborate geometric relevance-aware Transformer module aggregates highly correlated feature pairs from the reconstructed and known parts and decodes the vital geometric relations of the aggregated features. Critical geometric clues are thus supplied at both the data and feature levels, yielding enhanced features for box refinement. Extensive experiments on the KITTI and Waymo datasets show that SEGANet achieves low model complexity and excellent detection accuracy, surpassing the baseline method by 2.18% in overall detection accuracy and by 1.8% in average accuracy on weakly sensed objects. This verifies that SEGANet effectively alleviates the impact of point imbalance and significantly boosts detection performance.