Abstract

Recent 3D object detectors typically learn either voxel-based or point-based representations from point clouds. Point-based methods preserve precise point positions but incur a high computational cost, whereas voxel-based methods efficiently rasterize unordered points into voxel grids at the cost of an accuracy bottleneck. To take advantage of both voxel- and point-based representations, we develop an effective and efficient 3D object detector built on a novel voxel-point geometry abstraction scheme. Our motivation is to use the coarse voxel representation to accelerate proposal generation and the precise point representation to facilitate proposal refinement. For voxel representation learning, we propose a context enrichment module with a novel 3D sparse interpolation layer that augments raw points with multi-scale context. We further develop a point-based RoI pooling module with explicit position augmentation for proposal refinement. Extensive experiments on the widely used KITTI dataset and the recent Waymo Open Dataset show that the proposed algorithm outperforms state-of-the-art point-voxel-based methods while running at 24 FPS on a TITAN Xp GPU.
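The trade-off between the coarse voxel representation and the precise point representation can be illustrated with a minimal NumPy sketch. This is not the detector described above: the function name `voxelize`, its parameters, and the per-voxel averaging are assumptions chosen only to show how rasterizing points into a grid discards exact positions.

```python
import numpy as np

def voxelize(points, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Hypothetical helper: assign each 3D point to a voxel and average per voxel.

    Returns the occupied voxel indices, the mean point of each occupied voxel,
    and, for every input point, the row of the voxel it fell into.
    """
    points = np.asarray(points, dtype=np.float64)
    origin = np.asarray(origin, dtype=np.float64)

    # Integer grid coordinates: this is the step that loses sub-voxel precision.
    grid_idx = np.floor((points - origin) / voxel_size).astype(np.int64)

    # Keep only occupied voxels (a sparse representation of the grid).
    voxels, inverse = np.unique(grid_idx, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)

    # Average the points that share a voxel.
    sums = np.zeros((len(voxels), points.shape[1]))
    np.add.at(sums, inverse, points)
    counts = np.bincount(inverse, minlength=len(voxels))
    centroids = sums / counts[:, None]
    return voxels, centroids, inverse

# Illustrative data: two nearby points collapse into one voxel.
pts = np.array([[0.1, 0.1, 0.1], [0.3, 0.1, 0.1], [1.2, 0.1, 0.1]])
voxels, centroids, inverse = voxelize(pts, voxel_size=0.5)

# Per-point distance to its voxel's centroid: the position detail lost by voxelization.
quantization_error = np.linalg.norm(pts - centroids[inverse], axis=1)
```

In this sketch the voxel grid is cheap to process because only occupied cells are stored, while `quantization_error` measures the positional detail that a refinement stage operating on raw points would still have available.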
