Abstract

6D pose estimation from a single RGB image is a fundamental task in computer vision. We introduce a two-stage 6D pose estimation method for texture-less objects. Instead of utilizing the object mask in almost current monocular methods, we propose an edge representation for texture-less objects. An object is represented by the combination of visible edges corresponding to the object’s 6D pose, allowing the neural network to focus more on the object’s invariant global shape and structure, rather than indistinguishable local patches with image noise and similar texture. Given an RGB image, the proposed method predicts the direction and distance to a certain object keypoint from all object pixels within the range of object edge representation, establishes voting-based sparse 2D-3D correspondences, and solves the object pose with PnP algorithm. In the experiments, we found that directly replacing the object mask with the edge representation can bring a performance improvement in two current two-stage pipelines without any modification. Further evaluations on three different benchmark datasets containing symmetric and occluded objects show our method outperforms the state-of-the-art methods using RGB images only.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call