Abstract

6D pose estimation is widely used in robot tasks such as sorting and grasping. RGB-D-based methods have recently achieved remarkable success, but they remain susceptible to heavy occlusion. Our key insight is that the color and geometry information in RGB-D images are complementary, and that the crux of pose estimation under occlusion lies in leveraging both fully. To this end, we propose a new color and geometry feature fusion module that efficiently exploits these two complementary data sources. Unlike prior fusion methods, we adopt a two-stage strategy that performs color-depth fusion and local-global fusion in succession. Specifically, in the first stage we fuse the color features extracted from the RGB image into the point cloud. In the second stage, we extract local and global features from the fused point cloud using an ASSANet-like network and splice them together to obtain the final fusion features. Experiments on the widely used LineMod and YCB-Video datasets show that our method improves prediction accuracy while reducing training time.
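The two-stage fusion described above can be illustrated with a minimal sketch. This is not the authors' implementation: the feature widths, the random placeholder features, and the `local_pool` aggregation (a max over randomly sampled neighbours standing in for the ASSANet-style backbone) are all hypothetical, chosen only to show the data flow of color-depth fusion followed by local-global fusion.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 1024                 # points in the cloud
C_RGB, C_GEO = 32, 64    # hypothetical feature widths

# Stage 1: color-depth fusion -- append per-point color features
# (extracted from the RGB image; random placeholders here) to the
# per-point geometry features of the point cloud.
color_feat = rng.standard_normal((N, C_RGB))
geo_feat = rng.standard_normal((N, C_GEO))
fused = np.concatenate([color_feat, geo_feat], axis=1)      # (N, 96)

def local_pool(feats, k=16):
    # Hypothetical local aggregation: max over k randomly chosen
    # neighbours per point, standing in for an ASSANet-like network.
    idx = rng.integers(0, feats.shape[0], size=(feats.shape[0], k))
    return feats[idx].max(axis=1)                           # (N, 96)

# Stage 2: local-global fusion -- extract local features and a
# global feature, then splice them into the final fusion features.
local_feat = local_pool(fused)
global_feat = fused.max(axis=0, keepdims=True)              # (1, 96)
global_tiled = np.repeat(global_feat, N, axis=0)            # (N, 96)
final = np.concatenate([local_feat, global_tiled], axis=1)  # (N, 192)
print(final.shape)
```

Each point ends up with both a neighbourhood-level descriptor and the scene-level context, which is what lets occluded regions borrow evidence from visible ones.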
