Abstract

3D object detection is a fundamental technique in autonomous driving. However, current LiDAR-based single-stage 3D object detection algorithms pay insufficient attention to encoding the inhomogeneity of LiDAR point clouds and the shape of each object. This paper introduces a novel 3D object detection network, the spatial and part-aware aggregation network (SPANet), which uses a spatial aggregation network to remedy the inhomogeneity of LiDAR point clouds and incorporates a part-aware aggregation network that learns the statistical shape priors of objects. SPANet deeply integrates 3D voxel-based features and point-based spatial features to learn more discriminative point cloud features. Specifically, the spatial aggregation network draws on the flexible receptive fields of PointNet-based networks to obtain efficient learning and high-quality proposals. The part-aware aggregation network includes a part-aware attention mechanism that learns the statistical shape priors of objects to enhance the semantic embeddings. Experimental results show that the proposed single-stage method outperforms state-of-the-art single-stage methods on the KITTI 3D object detection benchmark. For car detection, it achieves a bird’s eye view (BEV) average precision (AP) of 91.59%, a 3D AP of 80.34%, and a heading AP of 95.03%.
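
To make the inhomogeneity problem concrete, the following is a minimal sketch of plain voxel grouping on a toy point cloud. It is not SPANet's implementation: the helper names `voxelize` and `voxel_centroids`, the voxel size, and the sample points are all assumptions chosen for illustration. It only shows how the number of points per voxel can vary sharply between regions near the sensor and regions far from it, which is the imbalance the abstract refers to.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size):
    """Group (x, y, z) points by the integer index of the voxel containing them.

    Hypothetical helper for illustration; not taken from the paper.
    """
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        voxels[key].append((x, y, z))
    return dict(voxels)


def voxel_centroids(voxels):
    """Mean-pool the points of each voxel into a single centroid feature."""
    return {
        key: tuple(sum(coord) / len(pts) for coord in zip(*pts))
        for key, pts in voxels.items()
    }


# Toy cloud (assumed data): three points close together, one isolated point.
points = [(0.1, 0.1, 0.0), (0.2, 0.3, 0.1), (0.4, 0.2, 0.0), (5.2, 5.1, 0.3)]

voxels = voxelize(points, voxel_size=1.0)
counts = {key: len(pts) for key, pts in voxels.items()}
centroids = voxel_centroids(voxels)

print(counts)  # point count per occupied voxel
```

In this toy case one voxel holds three points and another holds one, so a fixed-grid feature summarizes very different amounts of evidence per cell. Point-based aggregation with flexible receptive fields, as described in the abstract, is one way to compensate for that imbalance.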
