Abstract

Current point-voxel fusion methods for 3D object detection in autonomous driving cannot make full use of the complementary information between the two representations. Therefore, a novel two-stage 3D object detection method, called Accelerating Point-Voxel Representation (APVR), is proposed, which integrates the advantages of point-based and voxel-based features into a single 3D representation. The proposed method thereby retains more fine-grained object information while maintaining high efficiency. Specifically, computational cost is reduced by adding offsets to query the neighbouring voxels of key-points, and finer-grained information is obtained by computing the matching probability between neighbouring voxels and key-points. During refinement of the prediction boxes, virtual grid points are set to capture the spatial information between key-points, and a minimum-enclosing-rectangle constraint is added to optimize the directions of the prediction boxes. Extensive experiments on the KITTI, nuScenes and Waymo datasets demonstrate the strong generalizability and portability of the proposed approach, and comparisons with state-of-the-art methods confirm the effectiveness and efficiency of APVR, which reaches a real-time processing frame rate of 40.4 Hz while maintaining high prediction accuracy.
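The offset-based voxel query mentioned above can be illustrated with a minimal sketch. This is a generic, assumption-laden illustration of the idea (fixed integer offsets into a sparse voxel hash table give constant-time neighbourhood lookups per key-point, avoiding a radius search over all voxels); the function names, voxel size, and data layout are hypothetical and not taken from the APVR paper.

```python
# Hypothetical sketch of offset-based neighbour-voxel querying.
# All names and the voxel size are illustrative assumptions.
import itertools

VOXEL_SIZE = 0.2  # assumed voxel edge length in metres


def voxel_index(point, voxel_size=VOXEL_SIZE):
    """Map a 3D point to its integer voxel coordinate."""
    return tuple(int(c // voxel_size) for c in point)


def neighbor_voxels(key_point, voxel_table, voxel_size=VOXEL_SIZE):
    """Return the non-empty voxels in the 3x3x3 neighbourhood of a key-point.

    voxel_table: dict mapping voxel index -> voxel feature, built once from
    the sparse voxel grid. Each offset lookup is an O(1) hash access, so the
    query cost per key-point is constant rather than linear in the number
    of voxels, which is where the speed-up comes from.
    """
    base = voxel_index(key_point, voxel_size)
    found = {}
    for off in itertools.product((-1, 0, 1), repeat=3):
        idx = tuple(b + o for b, o in zip(base, off))
        if idx in voxel_table:
            found[idx] = voxel_table[idx]
    return found
```

In a full pipeline, the returned neighbouring-voxel features would then be weighted by their matching probability with the key-point before fusion; that step is omitted here.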
