Abstract

3D object detection is crucial to the reliability and stability of autonomous driving systems. In recent years, researchers have made great progress in 3D object detection by combining features from images and point clouds. However, detection accuracy still has considerable room for improvement, especially for small objects. In this paper, we propose a frustum-based 3D object detection model named PointFPN. The core idea of PointFPN is to learn expressive semantic and contextual information for small objects, e.g., pedestrians and cyclists in an urban street scene. To detect 3D objects, our model uses frustums to bridge the gap between images and point clouds and thereby generate proposals. A feature pyramid structure is then designed to extract and fuse multi-level features of target objects represented by point clouds. In addition, we develop a multilevel regression network that estimates different parameters of the 3D bounding box at different feature levels. With this design, our model learns discriminative features that are highly relevant to the bounding box parameters regressed at each feature level. Experiments show that our model is effective at detecting small objects and is robust to sparse point clouds. Our model demonstrates state-of-the-art performance on small object detection on the KITTI benchmark.
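
As an illustration of the frustum idea mentioned above, the following is a minimal sketch of how a 2D detection box can select the subset of a point cloud that forms a frustum proposal. It is not the PointFPN implementation: the function name `points_in_frustum`, the 3x4 projection matrix `P`, and the assumption that points are already expressed in the camera frame are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Box2D = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max) in pixels


def points_in_frustum(
    points: Sequence[Point],
    P: Sequence[Sequence[float]],
    box: Box2D,
) -> List[Point]:
    """Keep the points whose image projection falls inside a 2D box.

    Assumptions (illustrative only): `points` are in the camera frame,
    `P` is a 3x4 camera projection matrix, and `box` comes from any
    2D detector.
    """
    u_min, v_min, u_max, v_max = box
    kept: List[Point] = []
    for x, y, z in points:
        homogeneous = (x, y, z, 1.0)
        # Multiply the 3x4 matrix by the homogeneous point.
        u_h, v_h, depth = (
            sum(r * c for r, c in zip(row, homogeneous)) for row in P
        )
        # Points at or behind the image plane cannot be in the frustum.
        if depth <= 0.0:
            continue
        u, v = u_h / depth, v_h / depth
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept
```

The retained points would then be passed to the point-cloud feature extractor. Filtering in image space keeps the sketch independent of any particular frustum parameterization.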

