Abstract

Category-level pose estimation offers the generalization ability to novel objects unseen during training, which has attracted increasing attention in recent years. Despite the advantage, annotating real-world data with pose label is intricate and laborious. Although using synthetic data with free annotations can greatly reduce training costs, the Synthetic-to-Real (Sim2Real) domain gap could result in a sharp performance decline on real-world test. In this paper, we propose Dual-COPE, a novel prior-based category-level object pose estimation method with dual Sim2Real domain adaptation to avoid expensive real pose annotations. First, we propose an estimation network featured with conjoined prior deformation and transformer-based matching to realize high-precision pose prediction. Upon that, an efficient dual Sim2Real domain adaptation module is further designed to reduce the feature distribution discrepancy between synthetic and real-world data both semantically and geometrically, thus maintaining superior performance on real-world test. Moreover, the adaptation module is loosely coupled with estimation network, allowing for easy integration with other methods without any additional inference overhead. Comprehensive experiments show that Dual-COPE outperforms existing unsupervised methods and achieves state-of-the-art precision under supervised settings.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.