Abstract

Category-level object 6D pose estimation is essential for robotic manipulation, augmented reality, and 3D scene understanding. It aims to accurately predict the translation and rotation of instances with arbitrary shapes drawn from a given set of object classes, without requiring a model of each instance. However, such estimation remains difficult owing to intra-class shape variation and the challenge of accurately regressing dense correspondences between an observed point cloud and the reconstructed instance model from high-order, complex features. In this study, we propose a framework comprising a geometry-guided instance-aware prior network and a multi-stage reconstruction network to address these challenges. The geometry-guided instance-aware prior network avoids interference from unstable RGB data and robustly learns implicit semantic and geometric relations between the observed point cloud and the shape prior. We exploit these relations through a transformer network to cope with intra-class variation. Furthermore, the proposed multi-stage reconstruction network learns the residual of a preliminary reconstruction, improving the accuracy of the dense correspondences predicted from the complex features. Extensive experiments on a widely acknowledged benchmark for category-level 6D pose estimation demonstrate that the proposed method achieves a significant improvement in performance over previous methods.
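To make the residual idea behind the multi-stage reconstruction concrete, the following is a minimal, hypothetical sketch in PyTorch. It is not the paper's actual architecture: the module names, feature dimensions, and two-stage split are assumptions chosen only to illustrate how a first stage can deform a categorical shape prior into a preliminary reconstruction while a second stage predicts only a residual correction on top of it.

```python
import torch
import torch.nn as nn


class MultiStageReconstruction(nn.Module):
    """Conceptual sketch (not the paper's implementation): refine a categorical
    shape prior in stages, where the later stage predicts only a residual."""

    def __init__(self, feat_dim=128):
        super().__init__()
        # Stage 1: coarse per-point deformation of the prior point cloud.
        self.coarse_head = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 3))
        # Stage 2: residual refinement conditioned on the coarse result.
        self.residual_head = nn.Sequential(
            nn.Linear(feat_dim + 3, 64), nn.ReLU(), nn.Linear(64, 3))

    def forward(self, prior_points, fused_feats):
        # prior_points: (B, N, 3) categorical shape prior
        # fused_feats:  (B, N, C) per-point features fusing observation and prior
        coarse = prior_points + self.coarse_head(fused_feats)  # preliminary reconstruction
        residual = self.residual_head(torch.cat([fused_feats, coarse], dim=-1))
        refined = coarse + residual  # refined instance model
        return coarse, refined


if __name__ == "__main__":
    B, N, C = 2, 1024, 128
    model = MultiStageReconstruction(feat_dim=C)
    prior = torch.rand(B, N, 3)
    feats = torch.rand(B, N, C)
    coarse, refined = model(prior, feats)
    print(coarse.shape, refined.shape)  # torch.Size([2, 1024, 3]) for both
```

In such a scheme, supervising both outputs lets the second stage focus on small corrections to the preliminary reconstruction rather than regressing the full instance geometry, which is the intuition behind learning the residual rather than the reconstruction itself.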
