Abstract

Despite advances in red–green–blue-depth (RGB-D)-based six-degree-of-freedom (6D) pose estimation methods, severe occlusion remains challenging. To address this issue, we propose a novel feature fusion module that efficiently leverages the color and geometry information in RGB-D images. Unlike prior fusion methods, our method employs a two-stage fusion process. First, we extract color features from RGB images and integrate them into a point cloud. Second, a network modeled on anisotropic separable set abstraction processes the fused point cloud, extracting both local and global features, which are then combined to generate the final fusion features. Furthermore, we introduce a lightweight color feature extraction network to reduce model complexity. Extensive experiments on the LineMOD, Occlusion LineMOD, and YCB-Video datasets demonstrate that our method significantly enhances prediction accuracy, reduces training time, and is robust to occlusion. Further experiments show that our model is significantly smaller than the latest popular 6D pose estimation models, which suggests that it is easier to deploy on mobile platforms.
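The two-stage fusion described above can be illustrated with a minimal NumPy sketch. It is not the network proposed here: the function names, the tensor shapes, and the single linear layer standing in for the set-abstraction network are all assumptions made for illustration. Stage 1 gathers a color feature vector for each 3D point from the pixel it projects to and appends it to the point coordinates; stage 2 computes per-point (local) features, max-pools them into a global feature, and concatenates the two.

```python
import numpy as np

def fuse_color_into_points(points, color_feat_map, pixel_idx):
    """Stage 1 (illustrative): attach per-pixel color features to 3D points.

    points:         (N, 3) XYZ coordinates
    color_feat_map: (H, W, C) color features from an RGB backbone
    pixel_idx:      (N, 2) integer (row, col) of the pixel each point came from
    """
    rows, cols = pixel_idx[:, 0], pixel_idx[:, 1]
    per_point_color = color_feat_map[rows, cols]               # (N, C)
    return np.concatenate([points, per_point_color], axis=1)   # (N, 3 + C)

def local_global_fusion(fused_points, weight):
    """Stage 2 (stand-in): a linear layer + ReLU plays the role of the
    set-abstraction network; the global feature is a max-pool over points."""
    local = np.maximum(fused_points @ weight, 0.0)             # (N, D)
    global_feat = local.max(axis=0, keepdims=True)             # (1, D)
    global_tiled = np.broadcast_to(global_feat, local.shape)   # (N, D)
    return np.concatenate([local, global_tiled], axis=1)       # (N, 2 * D)

# Hypothetical sizes and random inputs, chosen only to exercise the code.
rng = np.random.default_rng(0)
H, W, C, N, D = 4, 5, 8, 6, 16
points = rng.normal(size=(N, 3))
color_feat_map = rng.normal(size=(H, W, C))
pixel_idx = np.stack([rng.integers(0, H, N), rng.integers(0, W, N)], axis=1)
weight = rng.normal(size=(3 + C, D))

fused = fuse_color_into_points(points, color_feat_map, pixel_idx)
features = local_global_fusion(fused, weight)
print(fused.shape, features.shape)
```

In this sketch every point carries both its own local descriptor and a copy of the scene-level descriptor, which is one common way to combine local and global features for per-point prediction.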
