Abstract

LiDAR-based 3D object detection has been a popular research topic in recent years and is widely used in autonomous driving and robot control. However, due to the scanning pattern of LiDAR, the point clouds of distant objects are sparse and harder to detect. To address this problem, we propose a two-stage network based on spatial context information, named SC-RCNN (Spatial Context RCNN), for object detection in 3D point cloud scenes. SC-RCNN first uses a backbone with sparse convolutions and submanifold sparse convolutions to extract voxel features from the point scene and generate a series of candidate boxes. To handle the sparsity of far-distance point clouds, we design local grid point pooling (LGP Pooling) to extract features and spatial context information around candidate regions for subsequent box refinement. In addition, we propose pyramid candidate box augmentation (PCB Augmentation), which expands candidate boxes in a multi-scale manner to enrich the feature encoding. Experimental results show that SC-RCNN significantly outperforms previous methods on the KITTI and Waymo datasets and is particularly robust to point cloud sparsity.
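The multi-scale expansion of candidate boxes described above can be pictured as growing each proposal into a small pyramid of progressively larger boxes. A minimal sketch of this idea is shown below; the scale factors, the `[x, y, z, dx, dy, dz, yaw]` box layout, and the function name `expand_boxes_pyramid` are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of multi-scale candidate-box expansion in the spirit
# of PCB Augmentation; scales and box layout are assumptions for illustration.
import numpy as np

def expand_boxes_pyramid(boxes, scales=(1.0, 1.5, 2.0)):
    """Expand each candidate box at several scales.

    boxes: (N, 7) array of [x, y, z, dx, dy, dz, yaw]. Centers and yaw are
    kept fixed; only the sizes (dx, dy, dz) are scaled.
    Returns an (N, len(scales), 7) array, one pyramid level per scale.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    out = np.repeat(boxes[:, None, :], len(scales), axis=1)  # (N, S, 7)
    for i, s in enumerate(scales):
        out[:, i, 3:6] *= s  # scale only the box dimensions
    return out

# Example: one car-sized proposal grows into a three-level pyramid.
pyr = expand_boxes_pyramid([[0.0, 0.0, 0.0, 4.0, 1.8, 1.6, 0.0]])
print(pyr.shape)        # (1, 3, 7)
print(pyr[0, 2, 3:6])   # largest level: [8.  3.6 3.2]
```

Pooling features inside each level would then let the second stage see both the tight proposal and its wider spatial context, which is especially useful when the points inside the original box are sparse.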
