Abstract
Existing multi-view fusion methods fuse point- or proposal-level features from different views only at the final stage of the backbone. This fuse-once-at-the-end design prevents timely correction of spatial misalignment between features from different views, so the discriminative depth and orientation details of 3D oriented point-cloud objects may be filtered out. To enhance the feature-capture capability of the network, we introduce a cascaded multi-3D-view fusion method (CM3DV) that learns an implicit representation of object orientation. Specifically, CM3DV incorporates the cylindrical front-view projection into a voxelised 3D bird's-eye-view representation in a cascaded manner, and vice versa. By learning a 3D-regulated instance representation, this bi-directional mutual fusion module, termed the cascaded multi-view feature fusion module, alleviates the spatial misalignment between the two views. Furthermore, to learn rotation- and shape-invariant object features, a modulated rotation head (MRH) applies a direction-guided adjustment instead of an axis-aligned structure to extract instance-consistent features. By excluding irrelevant content, MRH yields instance-consistent features that benefit object classification and orientation regression. Extensive experiments on the KITTI dataset show that the proposed method achieves a significant improvement over existing state-of-the-art methods, especially for orientation estimation.
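To make the bi-directional fusion idea concrete, the following is a minimal sketch (not the authors' code) of one cascade stage: each view receives a projected, gated copy of the other view's features, so misalignment can be corrected between backbone stages rather than only at the end. The class name `CascadedFusionStage` and the index maps `fv2bev_idx`/`bev2fv_idx` (assumed precomputed from the point-cloud geometry) are illustrative assumptions.

```python
import torch
import torch.nn as nn


class CascadedFusionStage(nn.Module):
    """Hypothetical single stage of bi-directional front-view/BEV fusion.

    Each view's features are warped into the other view via precomputed
    index maps, adapted by 1x1 convs, gated, and added residually.
    """

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convs adapt the projected cross-view features before fusion.
        self.fv_adapt = nn.Conv2d(channels, channels, kernel_size=1)
        self.bev_adapt = nn.Conv2d(channels, channels, kernel_size=1)
        # Sigmoid gates modulate how much cross-view evidence is mixed in.
        self.fv_gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.bev_gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, fv_feat, bev_feat, fv2bev_idx, bev2fv_idx):
        # fv_feat:  (B, C, Hf, Wf) cylindrical front-view features
        # bev_feat: (B, C, Hb, Wb) bird's-eye-view features
        # bev2fv_idx: (B, Hf, Wf) long tensor; for each FV pixel, the flat
        #   index of its BEV cell (values in [0, Hb*Wb))
        # fv2bev_idx: (B, Hb, Wb) long tensor; the reverse mapping
        B, C, Hf, Wf = fv_feat.shape
        _, _, Hb, Wb = bev_feat.shape

        # Gather BEV features at the cell each front-view pixel projects to.
        bev_flat = bev_feat.flatten(2)                        # (B, C, Hb*Wb)
        idx = bev2fv_idx.view(B, 1, -1).expand(-1, C, -1)     # (B, C, Hf*Wf)
        bev_in_fv = torch.gather(bev_flat, 2, idx).view(B, C, Hf, Wf)

        # And vice versa: front-view features gathered into the BEV grid.
        fv_flat = fv_feat.flatten(2)                          # (B, C, Hf*Wf)
        idx = fv2bev_idx.view(B, 1, -1).expand(-1, C, -1)     # (B, C, Hb*Wb)
        fv_in_bev = torch.gather(fv_flat, 2, idx).view(B, C, Hb, Wb)

        # Gated residual fusion in both directions.
        fv_out = fv_feat + self.fv_gate(fv_feat) * self.fv_adapt(bev_in_fv)
        bev_out = bev_feat + self.bev_gate(bev_feat) * self.bev_adapt(fv_in_bev)
        return fv_out, bev_out
```

Interleaving several such stages with backbone blocks would give the cascaded, mutual fusion the abstract describes, as opposed to a single fusion step at the end of the backbone.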