Abstract

Vision-based pose estimation is a fundamental task in many industrial applications such as bin-picking, autonomous assembly, and augmented reality. One of the most widely used approaches first detects 2D pose keypoints in the input image and then computes the 6D pose with a pose solver. Deep learning has recently become the dominant technique for pose keypoint detection, offering excellent accuracy and adaptability. However, it relies heavily on abundant, high-quality samples and supervision, which is especially problematic in industrial settings and leads to high data cost. Herein, a virtual-to-real knowledge transfer method for pose keypoint detection, based on domain adaptation and computer-aided-design (CAD) models, is proposed to reduce the data cost of deep learning. To address the disorder of knowledge flow, a viewpoint-driven feature alignment strategy is proposed that simultaneously eliminates interdomain differences and preserves intradomain differences. The shape invariance of rigid objects is then introduced as a constraint to address the large assumption space in regressive domain adaptation. Multidimensional experiments demonstrate the superiority of the method: without any real annotations, the normalized pixel error of keypoint detection is 0.033, and the proportion of pixel errors below 0.05 reaches 92.77%.
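
To make the keypoints-then-solver pipeline mentioned above concrete, the sketch below shows one common way to recover a 6D pose from detected 2D keypoints and their corresponding 3D points on the CAD model. The detector interface, the helper name `estimate_pose`, and the use of OpenCV's PnP solver are assumptions for illustration; the paper does not specify these implementation details.

```python
# Minimal sketch of a keypoint-based 6D pose pipeline (assumed, not the
# paper's exact implementation): a 2D keypoint detector followed by a
# perspective-n-point (PnP) solver against CAD-model keypoints.
import numpy as np
import cv2


def estimate_pose(image, detect_keypoints, model_points_3d, camera_matrix):
    """Detect 2D keypoints, then recover the 6D pose with a PnP solver.

    detect_keypoints : callable mapping an image to an (N, 2) array of pixels
                       (hypothetical stand-in for the learned detector)
    model_points_3d  : (N, 3) array of corresponding CAD-model keypoints
    camera_matrix    : (3, 3) camera intrinsic matrix
    """
    keypoints_2d = detect_keypoints(image)              # (N, 2) pixel coords
    ok, rvec, tvec = cv2.solvePnP(
        model_points_3d.astype(np.float64),
        keypoints_2d.astype(np.float64),
        camera_matrix,
        distCoeffs=None,                                # assume no lens distortion
    )
    if not ok:
        raise RuntimeError("PnP solver failed to find a pose")
    rotation, _ = cv2.Rodrigues(rvec)                   # 3x3 rotation matrix
    return rotation, tvec                               # 6D pose (R, t)
```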
