Abstract

Background: Owing to the refinement of regions of interest (RoIs), two-stage 3D detection algorithms usually obtain better performance than most single-stage detectors. However, most two-stage methods adopt feature concatenation to aggregate grid-point features through multi-scale RoI pooling in the second stage. This connection mode does not consider the correlation between multi-scale grid features. Methods: In the first stage, we employ 3D sparse convolution and 2D convolution to fully extract rich semantic features. A small number of coarse RoIs are then predicted by a region proposal network (RPN) on the generated bird's eye view (BEV) map. After that, we adopt a voxel RoI-pooling strategy to aggregate, for each grid point in an RoI, the features of neighboring non-empty voxels from the last two layers of the 3D sparse convolution. In this way, we obtain two aggregated features from the 3D sparse voxel space for each grid point. Next, we design an attention feature fusion module comprising a local and a global attention layer, which fully integrates the grid-point features from the different voxel layers. Results: We carried out experiments on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset. The average precisions of our proposed method are 88.21%, 81.51%, and 77.07% on the three difficulty levels (easy, moderate, and hard, respectively) for 3D detection, and 92.30%, 90.19%, and 86.00% on the same levels for BEV detection. Conclusions: We propose a novel two-stage 3D detection algorithm from point clouds, named Grid Attention Fusion Region-based Convolutional Neural Network (GAF-RCNN). Because we integrate multi-scale RoI grid features with an attention mechanism in the refinement stage, the multi-scale features are better correlated, and our method achieves a competitive level compared with other well-tested detection algorithms.
Such 3D object detection has important implications for robot and cobot technology.
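To make the fusion idea in the abstract concrete, the sketch below shows one way an attention-weighted fusion of two multi-scale grid-point features could look. This is a hypothetical toy illustration, not the paper's actual module: the per-source scalar score (here simply the mean activation) stands in for the learned local/global attention layers described in the Methods section, and all function names are assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_grid_features(feat_a, feat_b):
    """Attention-weighted fusion of two grid-point feature vectors.

    feat_a, feat_b: features pooled for one grid point from two different
    3D sparse-convolution layers (the "multi-scale" features). A scalar
    score per source (mean activation, a stand-in for a learned scoring
    layer) is softmax-normalized and used to weight the element-wise sum,
    so the two scales are correlated rather than merely concatenated.
    """
    score_a = sum(feat_a) / len(feat_a)
    score_b = sum(feat_b) / len(feat_b)
    w_a, w_b = softmax([score_a, score_b])
    return [w_a * a + w_b * b for a, b in zip(feat_a, feat_b)]
```

The fused vector keeps the dimensionality of each input, whereas plain concatenation would double it; the softmax weights let the stronger-responding scale dominate adaptively per grid point.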
