Abstract

The point clouds scanned by lidar are generally sparse, which results in relatively few sample points on each object. Precise and effective 3D object detection therefore requires a stronger feature representation that extracts more information from the object points. To this end, we propose an adaptive feature enhanced 3D object detection network based on point clouds (AFE-RCNN). AFE-RCNN is a point-voxel integrated network. We first voxelize the raw point clouds and obtain voxel features through a 3D voxel convolutional neural network. The 3D feature volumes are then projected to a 2D bird’s eye view (BEV), and the relationships among features in both the spatial and channel dimensions are learned by the proposed residual of dual attention (RDA) proposal generation module. High-quality 3D box proposals are generated from the BEV features with an anchor-based approach. Next, we sample key points from the raw point clouds to summarize the voxel features, and obtain key point features with a multi-scale feature extraction module based on adaptive feature adjustment. This module integrates neighboring contextual information into each key point and ensures robust feature processing. Lastly, we aggregate the BEV, voxel, and point cloud features into the key point features used for proposal refinement. In addition, to ensure the correlation among the vertices of the bounding box, we propose a refinement loss function module with vertex associativity. AFE-RCNN achieves performance comparable to state-of-the-art methods on the KITTI dataset and the Waymo Open Dataset. On the KITTI 3D detection benchmark, at the moderate difficulty level, the 3D detection mean average precision of AFE-RCNN reaches 81.53% for the car class and 67.50% for the cyclist class.
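
As a rough illustration of the residual of dual attention idea described above, the following PyTorch-style sketch applies a channel branch and a spatial branch to BEV features and adds a residual connection. The module structure, layer choices, and names (e.g., ResidualDualAttention, reduction) are assumptions made for illustration, not the paper's exact implementation.

```python
# Minimal sketch of a residual dual-attention block over BEV features.
# Layer choices and names are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn


class ResidualDualAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel branch: squeeze spatial dims, learn per-channel weights.
        self.channel_fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial branch: learn a per-location weight map from pooled channels.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, bev: torch.Tensor) -> torch.Tensor:
        # bev: (B, C, H, W) features projected from the 3D voxel backbone.
        out = bev * self.channel_fc(bev)                      # channel attention
        pooled = torch.cat([out.mean(dim=1, keepdim=True),
                            out.max(dim=1, keepdim=True).values], dim=1)
        out = out * self.spatial_conv(pooled)                 # spatial attention
        return bev + out                                      # residual connection
```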

Highlights

  • 3D object detection is a key technology of intelligent driving; with the rapid development of intelligent driving technology in recent years, its performance needs to be improved

  • We first evaluate AFE-RCNN on the KITTI dataset and compare it with state-of-the-art methods, and we conduct ablation experiments to verify the effectiveness of each component of the proposed AFE-RCNN

  • We conduct experiments on the Waymo Open Dataset to demonstrate the robustness of AFE-RCNN


Summary

Introduction

With the rapid development of intelligent driving technology, it is necessary to improve the performance of 3D object detection, as 3D object detection is a key technology of intelligent driving. Point-based methods use multi-layer perceptrons (MLPs) [3] and set abstraction to process raw point clouds; this kind of method usually achieves high detection precision. Voxel-based methods generally convert the unstructured point clouds into a 3D voxel grid or a 2D bird’s eye view grid. To address the above problems, we improve PV-RCNN in terms of point cloud feature extraction, the region proposal network (RPN), and the loss function for box proposal regression. We design a residual of dual attention proposal generation module, i.e., the RDA module, which learns the correlation of features in both the channel branch and the spatial branch while reducing information loss during transmission. The proposed multi-scale feature extraction module, based on adaptive feature adjustment, enhances the robustness of sparse point cloud features. The refinement loss function module constructs a regression loss based on projecting the 3D detection box into the BEV coordinate system and the DIoU loss.
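
The DIoU-based BEV regression term mentioned above can be sketched as follows. This is a minimal illustration of the standard DIoU formulation (IoU minus the normalized squared center distance) for axis-aligned BEV boxes; rotation handling and the paper's vertex-associativity terms are omitted, and the function name diou_loss_bev is hypothetical.

```python
# Hedged sketch of a DIoU-style regression term for axis-aligned BEV boxes.
import torch


def diou_loss_bev(pred: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    # pred, gt: (N, 4) boxes as (x1, y1, x2, y2) in BEV coordinates.
    inter_x1 = torch.max(pred[:, 0], gt[:, 0])
    inter_y1 = torch.max(pred[:, 1], gt[:, 1])
    inter_x2 = torch.min(pred[:, 2], gt[:, 2])
    inter_y2 = torch.min(pred[:, 3], gt[:, 3])
    inter = (inter_x2 - inter_x1).clamp(min=0) * (inter_y2 - inter_y1).clamp(min=0)

    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_g = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    iou = inter / (area_p + area_g - inter + 1e-7)

    # Squared distance between box centers.
    cx_p = (pred[:, 0] + pred[:, 2]) / 2
    cy_p = (pred[:, 1] + pred[:, 3]) / 2
    cx_g = (gt[:, 0] + gt[:, 2]) / 2
    cy_g = (gt[:, 1] + gt[:, 3]) / 2
    center_dist = (cx_p - cx_g) ** 2 + (cy_p - cy_g) ** 2

    # Squared diagonal of the smallest enclosing box.
    enc_w = torch.max(pred[:, 2], gt[:, 2]) - torch.min(pred[:, 0], gt[:, 0])
    enc_h = torch.max(pred[:, 3], gt[:, 3]) - torch.min(pred[:, 1], gt[:, 1])
    diag = enc_w ** 2 + enc_h ** 2 + 1e-7

    return (1.0 - iou + center_dist / diag).mean()
```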

Related Work
AFE-RCNN for Point Cloud Object Detection
Multi-Scale Feature Extraction Module Based on Adaptive Feature Adjustment
Refinement Loss Function Module with Vertex Associativity
Training Losses
Experiments and Results
Dataset and Implementation Details
Evaluation on the KITTI Online Test Server
Ablation Experiments Based on KITTI Validation Set
Evaluation on the KITTI Validation Set
Qualitative Analysis on the KITTI Dataset
Validation on the Waymo Open Dataset
Efficiency and Robustness Analysis of the Proposed Algorithm
Conclusions

