Abstract

The aim of this paper is to estimate the six-degree-of-freedom (6DOF) poses of objects from a single RGB image in which the target objects are partially occluded. Most recent studies have formulated methods that predict the projected two-dimensional (2D) locations of three-dimensional (3D) keypoints with a deep neural network and then use a Perspective-n-Point (PnP) algorithm to compute the 6DOF poses. Several researchers have pointed out the uncertainty of the predicted locations and modelled it according to predefined rules or functions, but the performance of such approaches may still degrade when occlusion is present. To address this problem, we formulated 2D keypoint locations as probability distributions in our novel loss function and developed a confidence-based pose estimation network. This network not only predicts the 2D keypoint locations from each visible patch of a target object but also provides the corresponding confidence values in an unsupervised fashion. By properly fusing the most reliable local predictions, the proposed method improves the accuracy of pose estimation when target objects are partially occluded. Experiments demonstrated that our method outperforms state-of-the-art methods on a standard occlusion benchmark for 6DOF object pose estimation. Moreover, the framework is efficient and feasible for real-time multimedia applications.
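The pipeline outlined above (per-patch 2D keypoint votes weighted by learned confidences, fused into a single estimate per keypoint, and passed to a PnP solver) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the array shapes, the weighted-mean fusion rule, and the helper names `fuse_keypoints` and `estimate_pose` are assumptions, and OpenCV's `solvePnPRansac` stands in for whichever PnP variant the paper actually uses.

```python
# Illustrative sketch only (not the authors' code): fuse per-patch keypoint
# votes by their predicted confidences, then recover the pose with PnP.
import numpy as np
import cv2


def fuse_keypoints(votes, confidences):
    """Fuse per-patch keypoint votes into one 2D location per keypoint.

    votes:       (P, K, 2) array; P patches, K keypoints, (x, y) votes.
    confidences: (P, K) array; predicted reliability of each vote.
    Returns:     (K, 2) array; confidence-weighted mean per keypoint.
    """
    w = confidences[..., None]                      # (P, K, 1)
    return (votes * w).sum(axis=0) / w.sum(axis=0)  # (K, 2)


def estimate_pose(points_3d, votes, confidences, camera_matrix):
    """Recover a 6DOF pose from fused 2D keypoints via PnP with RANSAC."""
    points_2d = fuse_keypoints(votes, confidences).astype(np.float64)
    ok, rvec, tvec, _ = cv2.solvePnPRansac(
        points_3d.astype(np.float64),  # (K, 3) keypoints on the object model
        points_2d,                     # (K, 2) fused image projections
        camera_matrix,                 # (3, 3) camera intrinsics
        None,                          # no lens distortion assumed
    )
    return (rvec, tvec) if ok else None
```

The weighted mean here is one plausible fusion rule; the point is that low-confidence votes from occluded patches contribute little, so the PnP input is dominated by the most reliable local predictions.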
