This study presents six degrees of freedom (6-DoF) pose estimation of an object from a single RGB image and retrieval of the matching CAD model by measuring the similarity between RGB and CAD rendering images. The 6-DoF pose estimation of an RGB object is one of the important techniques in 3D computer vision. However, in addition to 6-DoF pose estimation, retrieval and alignment of the matching CAD model with the RGB object should be performed for various industrial applications such as eXtended Reality (XR), Augmented Reality (AR), robot’s pick and place, and so on. This paper addresses 6-DoF pose estimation and CAD model retrieval problems simultaneously and quantitatively analyzes how much the 6-DoF pose estimation affects the CAD model retrieval performance. This study consists of two main steps. The first step is 6-DoF pose estimation based on the PoseContrast network. We enhance the structure of PoseConstrast by adding variance uncertainty weight and feature attention modules. The second step is the retrieval of the matching CAD model by an image similarity measurement between the CAD rendering and the RGB object. In our experiments, we used 2000 RGB images collected from Google and Bing search engines and 100 CAD models from ShapeNetCore. The Pascal3D+ dataset is used to train the pose estimation network and DELF features are used for the similarity measurement. Comprehensive ablation studies about the proposed network show the quantitative performance analysis with respect to the baseline model. Experimental results show that the pose estimation performance has a positive correlation with the CAD retrieval performance.
Read full abstract