Abstract
Due to the difficulty in generating a 6-Degree-of-Freedom (6-DoF) object pose estimation dataset, and the existence of domain gaps between synthetic and real data, existing pose estimation methods face challenges in improving accuracy and generalization. This paper proposes a methodology that employs higher quality datasets and deep learning-based methods to reduce the problem of domain gaps between synthetic and real data and enhance the accuracy of pose estimation. The high-quality dataset is obtained from Blenderproc and it is innovatively processed using bilateral filtering to reduce the gap. A novel attention-based mask region-based convolutional neural network (R-CNN) is proposed to reduce the computation cost and improve the model detection accuracy. Meanwhile, an improved feature pyramidal network (iFPN) is achieved by adding a layer of bottom-up paths to extract the internalization of features of the underlying layer. Consequently, a novel convolutional block attention module-convolutional denoising autoencoder (CBAM-CDAE) network is proposed by presenting channel attention and spatial attention mechanisms to improve the ability of AE to extract images' features. Finally, an accurate 6-DoF object pose is obtained through pose refinement. The proposed approach is compared to other models using the T-LESS and LineMOD datasets. Comparison results demonstrate the proposed approach outperforms the other estimation models.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.