Abstract

3D object detection studies how to perceive the environment effectively and how to classify and localize objects of interest accurately. To address the problem that objects are easily missed in complex environments (for example, under partial occlusion or at long distance), we propose a multi-scale 3D object detection method based on the feature pyramid network (FPN). First, the RGB image of the scene and the corresponding bird's eye view (BEV) map are fed into a pyramid feature extractor to construct multi-scale feature representations with strong semantics. Second, prior anchor boxes are applied to each layer of the feature pyramid, and the fused region features are passed to the classifier and regressor. Finally, cross-scale detection is performed to obtain the best classification, dimension, and orientation estimates. Experiments on the KITTI dataset show that FPN-based feature extraction and cross-scale detection effectively improve the detection of partially occluded and long-distance objects.
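
The top-down merging step at the heart of an FPN-style extractor can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes single-channel feature maps stored as nested lists, nearest-neighbour 2x upsampling, and plain element-wise addition in place of the learned lateral and smoothing convolutions. The names `upsample2x` and `top_down_merge` are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D map given as a list of rows."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the row vertically
    return out


def top_down_merge(bottom_up):
    """Fuse bottom-up maps into pyramid levels (assumed simplification of FPN).

    bottom_up: list of 2D maps ordered finest first, each half the
    resolution of the previous one. Returns fused maps, finest first.
    """
    merged = [bottom_up[-1]]  # the coarsest level is used as-is
    for lateral in reversed(bottom_up[:-1]):
        up = upsample2x(merged[0])
        # Element-wise sum of the lateral map and the upsampled coarser map.
        fused = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(lateral, up)]
        merged.insert(0, fused)
    return merged


# Toy bottom-up maps at three scales (4x4, 2x2, 1x1).
c2 = [[1.0] * 4 for _ in range(4)]
c3 = [[2.0] * 2 for _ in range(2)]
c4 = [[3.0]]
p2, p3, p4 = top_down_merge([c2, c3, c4])
```

Each fused level keeps the resolution of its bottom-up input while inheriting the semantics of the coarser levels, which is why anchors can then be placed on every level to cover objects of different apparent sizes.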
