Abstract

Point cloud object detection plays an increasingly important role in autonomous driving. To address the insensitivity of point cloud detectors to sparse objects, we improve the voxel encoding and 3D backbone network of PV-RCNN++. We introduce adaptive pooling operations during voxel feature encoding to enrich the point cloud information within each voxel, and then apply multi-layer perceptrons to extract richer point features. In the 3D backbone network, we employ adaptive sparse convolution operations to make the network's channel count more flexible, allowing it to accommodate a wider range of input data. Furthermore, we integrate Focal Loss to address class imbalance in the detection task. Experimental results on the public KITTI dataset demonstrate significant improvements over PV-RCNN++, particularly for pedestrian and cyclist detection: accuracy increases by 1% for pedestrians and by 2.1% for cyclists. Our detection performance also surpasses that of the other detection algorithms compared.
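As background on the class-imbalance remedy mentioned above, the following is a minimal NumPy sketch of the binary focal loss of Lin et al., which down-weights well-classified examples so that rare classes (such as pedestrians and cyclists) dominate the loss. The `alpha` and `gamma` values are the commonly used defaults, not necessarily the settings tuned in this work.

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t).

    p: predicted probability of the positive class, in (0, 1)
    y: ground-truth label, 0 or 1
    The (1 - p_t)**gamma factor shrinks the loss of easy examples,
    focusing training on hard, misclassified ones.
    """
    p_t = np.where(y == 1, p, 1.0 - p)          # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# A confident correct prediction is penalised far less than a hard one:
easy = focal_loss(np.array([0.95]), np.array([1]))
hard = focal_loss(np.array([0.30]), np.array([1]))
```

With `gamma = 0` and `alpha = 0.5` this reduces (up to a constant factor) to the standard cross-entropy, which treats easy and hard examples alike; raising `gamma` is what makes the loss focus on sparse, hard-to-detect objects.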
