Highlights

• Vertical distribution characteristics (VDCs) of point clouds are robust to point sparsity and provide semantic information.
• Decorating pillars with VDCs and semantic labels enables robust and discriminative feature extraction.
• The adaptive object augmentation (AOA) paradigm resolves the conflict between augmented objects and their target scenes.
• 3D-VDNet significantly outperforms the baseline on all classes (Car, Pedestrian, and Cyclist).
• The AOA paradigm delivers stable performance across detector types (voxel- and point-based).

Abstract

Accurate 3D object detection is limited by the sparsity of LiDAR point clouds. The vertical distribution characteristics (VDCs) of the points within pillars are robust to sparsity and carry informative semantic cues about objects. Building on this observation, we propose a novel 3D object detection framework that exploits VDCs to improve both feature extraction and object augmentation. Specifically, a Spatial Feature Aggregation module performs robust feature extraction by decorating pillars with their VDCs. To spatially enhance the semantic embeddings, we use the VDCs to construct a voxelized semantic map that serves as an additional input stream. Moreover, we develop an Adaptive Object Augmentation (AOA) paradigm that uses VDCs to search for suitable ground regions in which to "paste" virtual objects, thereby avoiding conflicts between augmented objects and the new scenes. Extensive experiments on the KITTI dataset demonstrate that our framework significantly outperforms the baseline, achieving moderate-AP improvements of 3.74%/1.59% on the Car 3D/BEV benchmarks at an inference speed of 38 FPS. Furthermore, we show that the AOA module performs stably across different detectors.
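To make the central notion concrete, below is a minimal sketch (not the authors' code) of one plausible way to compute per-pillar VDCs: each pillar is a vertical column in the x-y grid, and its VDC is summarized as a normalized histogram of point heights plus simple z-statistics. The exact VDC definition, as well as the pillar size, z-range, and bin count, are illustrative assumptions not specified in the abstract.

```python
import numpy as np

def pillar_vdcs(points, pillar_size=0.16, z_range=(-3.0, 1.0), num_bins=8):
    """Sketch of per-pillar vertical distribution characteristics (VDCs).

    points: (N, 3) array of x, y, z LiDAR coordinates.
    Returns a dict mapping pillar grid index (ix, iy) to a VDC vector:
    [normalized z-histogram (num_bins), z_min, z_max, z_mean, point count].
    """
    # Assign each point to a pillar in the x-y grid.
    ix = np.floor(points[:, 0] / pillar_size).astype(np.int64)
    iy = np.floor(points[:, 1] / pillar_size).astype(np.int64)
    vdcs = {}
    for key in set(zip(ix.tolist(), iy.tolist())):
        mask = (ix == key[0]) & (iy == key[1])
        z = points[mask, 2]
        # Histogram of point heights within the pillar's vertical extent.
        hist, _ = np.histogram(z, bins=num_bins, range=z_range)
        # Normalizing by the point count makes the shape of the vertical
        # distribution largely insensitive to how many points fall in
        # the pillar, which is what makes a VDC robust to sparsity.
        hist = hist / max(hist.sum(), 1)
        vdcs[key] = np.concatenate([hist, [z.min(), z.max(), z.mean(), z.size]])
    return vdcs
```

Under this reading, an AOA-style ground search could, for example, flag pillars whose histogram mass concentrates in the lowest bins with a small z-span as candidate ground regions for pasting virtual objects; the actual criterion used by the paper is not given in the abstract.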
