Abstract

Six-degree-of-freedom (6D) pose estimation of objects is important for robotic manipulation but remains challenging for occluded and textureless objects. To address this challenge, we propose an end-to-end robust network for real-time 6D pose estimation of rigid objects from a single RGB image. The method develops a fully convolutional network with a feature pyramid that effectively boosts the accuracy of the pixelwise labels and the unit direction vector field that take part in voting for object keypoints. The network additionally estimates the distance between each pixel and keypoint, which helps select accurate hypotheses in the RANSAC process and avoids hypothesis deviations caused by errors in the unit direction vectors of pixels far from the keypoints. A vectorial distance regularization loss function helps Perspective-n-Point (PnP) find 2D-3D correspondences between the 3D object keypoints and their estimated 2D counterparts. Experiments are performed on the widely used LINEMOD and Occlusion LINEMOD datasets with the ADD(-S) and 2D projection evaluation metrics. The results show that our method improves pose estimation performance over the state of the art while retaining real-time efficiency.
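The voting scheme the abstract describes can be illustrated with a minimal sketch: each pixel predicts a unit vector pointing toward a keypoint, pairs of pixels generate keypoint hypotheses by intersecting their rays, and hypotheses are scored by counting pixels whose direction agrees. The distance check in the inlier test is a hypothetical stand-in for the paper's distance-aware hypothesis selection; all function names and thresholds here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def intersect_rays(p1, v1, p2, v2):
    # Solve p1 + t1*v1 = p2 + t2*v2 for the 2D intersection point.
    A = np.stack([v1, -v2], axis=1)
    if abs(np.linalg.det(A)) < 1e-9:
        return None  # (nearly) parallel rays yield no stable hypothesis
    t = np.linalg.solve(A, p2 - p1)
    return p1 + t[0] * v1

def ransac_keypoint_vote(pixels, dirs, dists, n_iter=100,
                         angle_thresh=0.99, dist_tol=5.0, rng=None):
    """Estimate a 2D keypoint from per-pixel unit direction vectors.
    A pixel votes for a hypothesis only if its direction agrees AND its
    predicted pixel-to-keypoint distance is consistent (illustrative
    version of distance-aware hypothesis selection)."""
    rng = np.random.default_rng(rng)
    best_h, best_score = None, -1
    for _ in range(n_iter):
        i, j = rng.choice(len(pixels), size=2, replace=False)
        h = intersect_rays(pixels[i], dirs[i], pixels[j], dirs[j])
        if h is None:
            continue
        to_h = h - pixels                       # vectors pixel -> hypothesis
        norms = np.linalg.norm(to_h, axis=1) + 1e-9
        cos = np.sum(to_h / norms[:, None] * dirs, axis=1)
        inlier = (cos > angle_thresh) & (np.abs(norms - dists) < dist_tol)
        score = inlier.sum()
        if score > best_score:
            best_score, best_h = score, h
    return best_h
```

In a full pipeline, the 2D keypoints recovered this way for several model points would be passed, together with their known 3D coordinates, to a PnP solver (e.g. OpenCV's `solvePnP`) to obtain the 6D pose.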
