Abstract

Background: 3D object detection from point clouds in road scenes has attracted much attention recently. Voxel-based methods voxelize the scene into regular grids, which can be processed by advanced feature learning frameworks built on convolutional layers for semantic feature learning. Point-based methods can extract the geometric features of the points because the point coordinates are preserved. The combination of the two is effective for 3D object detection. However, current methods use a voxel-based detection head with preset anchors for classification and localization. Although the preset anchors cover the entire scene, they are not suitable for detection tasks with larger scenes and multiple categories of objects, due to the limitation of the voxel size. Additionally, the misalignment between the predicted confidence and the proposals during Region of Interest (RoI) selection hinders 3D object detection. Methods: We investigate the combination of voxel-based and point-based methods for 3D object detection. A voxel-to-point module that captures semantic and geometric features is proposed. The voxel-to-point module is conducive to the detection of small-size objects and avoids presetting anchors in the inference stage. Moreover, a confidence adjustment module with center-boundary-aware confidence attention is proposed to resolve the misalignment between the predicted confidence and the proposals during RoI selection. Results: The proposed method achieves state-of-the-art results for 3D object detection on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) object detection dataset. As of September 19, 2021, our method ranked 1st in the 3D and Bird's Eye View (BEV) detection of cyclists tagged with difficulty level ‘easy’, and ranked 2nd in the 3D detection of cyclists tagged with ‘moderate’. Conclusions: We propose an end-to-end two-stage 3D object detector with a voxel-to-point module and a confidence adjustment module.
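
To make the voxel-to-point idea concrete, the snippet below is a minimal sketch, assuming a dense voxel feature volume and a simple nearest-voxel lookup: each raw point inherits the semantic feature of the voxel it falls into and keeps its own coordinates as the geometric part, so subsequent classification and localization can operate per point without preset anchors. The function name, tensor layout, and lookup scheme are illustrative assumptions, not the paper's exact voxel-to-point module.

```python
import numpy as np

def voxel_to_point(points, voxel_feats, voxel_size, pc_min):
    """Gather voxel-wise semantic features back to raw point coordinates.

    Generic sketch of the voxel-to-point idea (not the paper's exact VTPM):
    each point inherits the feature of the voxel it falls into, and the
    result is concatenated with the point's own coordinates as a geometric cue.

    points      : (N, 3) xyz coordinates
    voxel_feats : (D, H, W, C) dense voxel feature volume (assumed layout)
    voxel_size  : (3,) voxel edge lengths along x, y, z
    pc_min      : (3,) minimum x, y, z of the detection range
    """
    # Compute the voxel index of every point.
    idx = np.floor((points - pc_min) / voxel_size).astype(np.int64)
    D, H, W, C = voxel_feats.shape
    idx = np.clip(idx, 0, [W - 1, H - 1, D - 1])      # clamp to the grid
    # Look up the semantic feature of the containing voxel (z, y, x order).
    sem = voxel_feats[idx[:, 2], idx[:, 1], idx[:, 0]]  # (N, C)
    # Keep the raw coordinates as the geometric part of the feature.
    return np.concatenate([points, sem], axis=1)         # (N, 3 + C)
```

A trilinear interpolation over neighboring voxels, as is common in voxel-point hybrids, could replace the nearest-voxel lookup in a fuller implementation.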

Highlights

  • Point-based, voxel-based, and the combination of the two are the mainstream methods for 3D object detection from point clouds

  • The voxel-based methods [7, 8, 9, 10, 11] divide the entire space into regular grids, project them to the BEV, and use convolutional neural networks to learn the semantic features of the scene (see the sketch after this list)

  • We propose a voxel-to-point module (VTPM) that captures semantic and geometric features for 3D object detection in the point space, which is more conducive to the detection of small-size objects
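
As referenced in the second highlight above, here is a minimal sketch of the generic voxel-based pipeline: the point cloud is scattered into a regular grid and collapsed into a BEV pseudo-image that a 2D convolutional backbone can consume. The grid extent, resolution, and mean-height aggregation are illustrative assumptions roughly following common KITTI settings, not necessarily the configuration used by SGNet or the cited methods [7, 8, 9, 10, 11].

```python
import numpy as np

def points_to_bev(points, pc_range=(0.0, -40.0, -3.0, 70.4, 40.0, 1.0),
                  voxel_size=(0.05, 0.05, 0.1)):
    """Scatter a point cloud into a regular grid and collapse it to a BEV map.

    Illustrative sketch only: range and resolution roughly follow common
    KITTI settings. points: (N, 3) xyz coordinates in the LiDAR frame.
    Returns an (H, W) occupancy map and an (H, W) mean-height map, which a
    2D convolutional backbone could take as a BEV pseudo-image.
    """
    x_min, y_min, z_min, x_max, y_max, z_max = pc_range
    vx, vy, _ = voxel_size
    W = int(round((x_max - x_min) / vx))
    H = int(round((y_max - y_min) / vy))

    # Keep only the points inside the detection range.
    mask = ((points[:, 0] >= x_min) & (points[:, 0] < x_max) &
            (points[:, 1] >= y_min) & (points[:, 1] < y_max) &
            (points[:, 2] >= z_min) & (points[:, 2] < z_max))
    pts = points[mask]

    # Integer grid coordinates in the BEV plane.
    ix = ((pts[:, 0] - x_min) / vx).astype(np.int64)
    iy = ((pts[:, 1] - y_min) / vy).astype(np.int64)

    occupancy = np.zeros((H, W), dtype=np.float32)
    height_sum = np.zeros((H, W), dtype=np.float32)
    np.add.at(occupancy, (iy, ix), 1.0)
    np.add.at(height_sum, (iy, ix), pts[:, 2])

    # Mean height per cell; empty cells stay at zero.
    mean_height = np.divide(height_sum, occupancy,
                            out=np.zeros_like(height_sum),
                            where=occupancy > 0)
    return occupancy, mean_height
```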

Summary

Introduction

Point-based, voxel-based, and the combination of the two are the mainstream methods for 3D object detection from point clouds. The voxel-based methods [7, 8, 9, 10, 11] divide the entire space into regular grids, project them to the BEV, and use convolutional neural networks to learn the semantic features of the scene. To address the learning of discriminative features in the point space within a combined voxel-and-point framework, as well as the misalignment between the predicted confidence and the bounding box when selecting RoIs, this paper proposes SGNet, a novel end-to-end voxel-point-based 3D detector that uses a joint loss function to constrain feature learning and designs a point-based detection head to avoid the redundant calculations of voxel-based detection heads. The main contributions can be summarized as follows.
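
To illustrate what a point-based, anchor-free detection head can look like, the snippet below is a minimal sketch under our own assumptions (layer widths, class count, and box parameterization are illustrative, not SGNet's actual head): shared MLPs applied to fused per-point features predict a per-class foreground score and a 7-DoF box residual at every point, so no anchors need to be tiled over the voxel grid.

```python
import torch
import torch.nn as nn

class PointDetectionHead(nn.Module):
    """Generic anchor-free, point-based detection head (illustrative sketch).

    Each point carries a fused semantic + geometric feature vector; shared
    MLPs predict a per-class foreground score and a 7-DoF box
    (x, y, z, w, l, h, yaw) residual relative to the point, so no anchors
    need to be preset over the scene.
    """

    def __init__(self, in_channels=128, num_classes=3, box_dim=7):
        super().__init__()
        self.cls_head = nn.Sequential(
            nn.Linear(in_channels, 128), nn.ReLU(inplace=True),
            nn.Linear(128, num_classes))
        self.box_head = nn.Sequential(
            nn.Linear(in_channels, 128), nn.ReLU(inplace=True),
            nn.Linear(128, box_dim))

    def forward(self, point_feats):
        # point_feats: (B, N, C) fused per-point features
        cls_logits = self.cls_head(point_feats)   # (B, N, num_classes)
        box_preds = self.box_head(point_feats)    # (B, N, 7)
        return cls_logits, box_preds
```

A joint loss in this setting would typically combine a classification loss (e.g., focal loss) over cls_logits with a regression loss (e.g., smooth-L1) over box_preds for foreground points; this pairing is a common choice rather than the paper's exact formulation.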

Related works
Voxel-point based point cloud feature learning
The misalignment between confidence and bounding box
SGNet for 3D object detection from point clouds
Point cloud encoder
Voxel-to-point module
Confidence adjustment module
Refinement with the combined features
Training loss
Datasets
Implementation Details
Evaluation on KITTI test dataset
Comparison of results on the BEV detection
Evaluation on KITTI val dataset
Ablation Studies
Position encoder
Conclusion