Abstract

Deep learning methods dominate 6D object pose estimation and fall mainly into two groups: direct regression and two-stage pipelines. Direct regression initially attracted many researchers because it is simple and differentiable with respect to the pose, but it is usually less accurate than two-stage pipelines, which first estimate geometry-related intermediate variables such as object keypoints or 2D-3D correspondences and then recover the pose with a PnP/RANSAC algorithm. However, the loss function of a two-stage method is not differentiable with respect to the 6D pose, which makes such methods hard to apply in tasks that require differentiable poses. To overcome the drawbacks of both, we propose an end-to-end keypoint-based regression network for 6D pose estimation. Specifically, we supervise point-wise keypoint offsets, which helps the network learn geometric information, and directly regress the 6D pose by aggregating keypoints, which keeps the pipeline differentiable with respect to the pose. Furthermore, we improve the sampling method by sampling points around objects, which benefits small objects, and we design a unit loss function that helps the network learn the keypoints. Experimental results show that our approach outperforms most methods on the LM, LM-O and YCB-V datasets.
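To make the idea of point-wise keypoint offsets concrete, the following is a minimal sketch and not the network described above. It assumes each sampled 3D point predicts one offset vector per keypoint, and that the votes are combined by a plain mean; the function name `aggregate_keypoints`, the toy data, and the mean-based aggregation are all illustrative assumptions. In the proposed method the aggregation and the final pose regression are learned by the network.

```python
from typing import List, Sequence

Vec3 = Sequence[float]


def aggregate_keypoints(
    points: Sequence[Vec3],
    offsets: Sequence[Sequence[Vec3]],
) -> List[List[float]]:
    """Average per-point keypoint votes (hypothetical helper).

    points:  N sampled 3D points.
    offsets: for each point, K offset vectors, one per keypoint.
    Each point votes for keypoint j at point + offset_j; the votes
    are averaged. A mean is differentiable with respect to the
    offsets, which is the property the abstract relies on.
    """
    n = len(points)
    k = len(offsets[0])
    keypoints = []
    for j in range(k):
        acc = [0.0, 0.0, 0.0]
        for i in range(n):
            for d in range(3):
                acc[d] += points[i][d] + offsets[i][j][d]
        keypoints.append([a / n for a in acc])
    return keypoints


# Toy data (assumed): three sampled points and two keypoints.
points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
true_keypoints = [[0.5, 0.5, 0.5], [-1.0, 0.0, 2.0]]

# Ideal offsets point exactly from each sampled point to each keypoint.
offsets = [
    [[kp[d] - p[d] for d in range(3)] for kp in true_keypoints]
    for p in points
]

estimated = aggregate_keypoints(points, offsets)
```

With ideal offsets every point votes for the same location, so the averaged estimate coincides with the true keypoints. With noisy predicted offsets, averaging over many points reduces the effect of individual errors.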
