Abstract

3D object detection from LiDAR point clouds has gained great attention in recent years due to its wide applications in smart cities and autonomous driving. The cascade framework has proven effective in 2D object detection but has been less investigated in 3D space. Conventional cascade structures use multiple separate sub-networks to sequentially refine region proposals. Such methods, however, have limited ability to measure proposal quality across all stages, and struggle to achieve a desirable performance improvement in 3D space. This paper proposes a new cascade framework, termed CasA, for 3D object detection from LiDAR point clouds. CasA consists of a Region Proposal Network (RPN) and a Cascade Refinement Network (CRN). In the CRN, we design a new Cascade Attention Module that uses multiple sub-networks and attention modules to aggregate the object features from different stages and progressively refine region proposals. CasA can be integrated into various two-stage 3D detectors and improve their performance. Extensive experiments on the KITTI and Waymo datasets with various baseline detectors demonstrate the universality and superiority of CasA. In particular, based on one variant of Voxel-RCNN, we achieve state-of-the-art results on the KITTI dataset. On the KITTI online 3D object detection leaderboard, we achieve 83.06%, 47.09%, and 73.47% Average Precision (AP) on the moderate Car, Pedestrian, and Cyclist classes, respectively. Code is available at https://github.com/hailanyi/CasA.
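
To make the idea of aggregating features across stages concrete, the following is a minimal illustrative sketch, not the CasA implementation. The abstract does not specify the attention form, so the scaled dot-product weighting with the latest stage as query is an assumption, and the names `aggregate_stage_features`, `cascade_refine`, `extract_features`, and `predict_residual` are hypothetical. It is written in plain Python on small lists so that it runs without dependencies.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_stage_features(stage_features):
    """Attention-weighted sum of per-stage feature vectors.

    Assumption: the most recent stage acts as the query and every stage
    (including itself) acts as key and value, with scaled dot-product scores.
    """
    query = stage_features[-1]
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, feat)) / scale
        for feat in stage_features
    ]
    weights = softmax(scores)
    return [
        sum(w * feat[i] for w, feat in zip(weights, stage_features))
        for i in range(len(query))
    ]


def cascade_refine(box, extract_features, predict_residual, num_stages=3):
    """Refine a box over several stages, fusing features from all earlier stages.

    extract_features and predict_residual are hypothetical callables standing
    in for learned sub-networks.
    """
    history = []
    for _ in range(num_stages):
        history.append(extract_features(box))          # features for the current proposal
        fused = aggregate_stage_features(history)       # attend over all stages so far
        residual = predict_residual(fused)              # per-parameter box correction
        box = [b + r for b, r in zip(box, residual)]    # refined proposal for the next stage
    return box
```

The point of the sketch is the contrast with a conventional cascade: each stage here predicts its correction from a fusion of all earlier stage features rather than from its own features alone.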
