Abstract
Pseudo point cloud representations can significantly improve the precision of 3D object detection. However, existing pseudo point cloud-based methods typically fuse the processed features through coarse concatenation, which ignores the consistency between point cloud and pseudo point cloud features. Such inconsistency between features from different modalities can lead to detection bias. In this paper, we propose a novel pseudo point cloud-based network called SGF3D, which uses a cross-modal attention module, cross-modal attention fusion (CMAF), to fuse point cloud and pseudo point cloud features. CMAF better learns the cross-modal similarity of the output features, enabling the detection box to fit the target more closely. We also design a region of interest (RoI) head, the similarity attention head (SAH), which exploits this otherwise overlooked similarity to optimize training without increasing the complexity of the network. With CMAF and SAH, the proposed method obtains more accurate bounding boxes. Extensive experiments on the KITTI dataset demonstrate that the proposed method achieves competitive results. Training code and trained weights are available at https://github.com/ChunZheng2022/SGF3D.
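To make the contrast between coarse concatenation and attention-based fusion concrete, the following is a minimal, generic sketch of single-head cross-attention between two small feature sets. It is not the CMAF module from the SGF3D repository: the function names (`cross_attention_fuse`, `concat_fuse`), the lack of learned projections, the residual addition, and the toy 2-dimensional features are all assumptions made purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_fuse(point_feats, pseudo_feats):
    """Fuse each point feature with an attention-weighted mix of pseudo features.

    Hypothetical sketch: single head, no learned query/key/value projections,
    and a residual add. Queries come from the point cloud branch; keys and
    values come from the pseudo point cloud branch.
    """
    d = len(point_feats[0])
    scale = math.sqrt(d)
    fused = []
    for q in point_feats:
        # Scaled dot-product similarity between this query and every key.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in pseudo_feats]
        weights = softmax(scores)
        # Attention-weighted sum of the pseudo point cloud features.
        context = [
            sum(w * v[j] for w, v in zip(weights, pseudo_feats)) for j in range(d)
        ]
        fused.append([qi + ci for qi, ci in zip(q, context)])
    return fused


def concat_fuse(point_feats, pseudo_feats):
    """Baseline: plain channel-wise concatenation of paired features."""
    return [p + s for p, s in zip(point_feats, pseudo_feats)]


# Toy features (assumed values, not real detector outputs).
point_feats = [[1.0, 0.0], [0.0, 1.0]]
pseudo_feats = [[1.0, 0.0], [0.0, 1.0]]

fused = cross_attention_fuse(point_feats, pseudo_feats)
concat = concat_fuse(point_feats, pseudo_feats)
```

In this sketch, concatenation doubles the channel width and leaves any relation between the two branches for later layers to discover, whereas the attention variant keeps the channel width and weights each pseudo point cloud feature by its similarity to the point cloud feature it is fused with.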
More From: Image and Vision Computing