Abstract

6-D pose estimation is a crucial task in vision-based measurement for robotic manipulation. It is challenging because of varied lighting conditions, cluttered backgrounds, occlusion, and texture-less objects. Varied lighting conditions and texture-less objects lead to dramatic changes in imaging. In this article, we propose an edge-attention 6-D pose estimation network (EANet) for texture-less objects that achieves autonomous perception of edges, which are invariant when lighting conditions change and objects are texture-less. To achieve this, EANet adopts a multitask learning strategy that introduces edge cues into the pixelwise dense fusion framework. We design a shared-weight edge extractor that serves edge reconstruction and pose estimation simultaneously. The purpose of edge reconstruction is to guide the network to pay more attention to edges, so as to implicitly enhance the performance of pose estimation. The edge cues can further improve the performance of pose refinement. Furthermore, we apply a skip-connection scheme to the edge extractor, merging feature maps of the deep and shallow layers. Owing to its lightweight design, EANet achieves an inference speed of 20 frames per second (FPS) on a desktop PC with an RTX 2070 GPU, which satisfies the requirement of most real-time applications. Experiments show that our network yields a significant improvement on texture-less objects and approaches state-of-the-art performance on both the LINEMOD and YCB-Video datasets. Finally, we present a PARTS dataset to test real-world performance and apply EANet to bin-picking vision measurement, demonstrating that our method provides a complete and robust solution for vision-based pose measurement.
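The skip-connection scheme mentioned above merges feature maps from deep and shallow layers of the edge extractor. The paper does not give the exact implementation here; the following is a minimal sketch of one common way to do such merging, assuming nearest-neighbour upsampling of the deep (low-resolution) map and channel-wise concatenation with the shallow (high-resolution) map. All shapes and function names are illustrative, not taken from EANet.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def skip_merge(shallow, deep):
    """Merge a shallow feature map with an upsampled deep one by
    concatenating along the channel axis (U-Net-style skip connection)."""
    factor = shallow.shape[1] // deep.shape[1]  # spatial ratio, assumed integer
    return np.concatenate([shallow, upsample_nearest(deep, factor)], axis=0)

# Hypothetical example: a 32-channel shallow map at 64x64 merged with a
# 64-channel deep map at 16x16 gives a 96-channel map at 64x64.
shallow = np.zeros((32, 64, 64))
deep = np.zeros((64, 16, 16))
merged = skip_merge(shallow, deep)
print(merged.shape)  # (96, 64, 64)
```

In a real network the concatenation would typically be followed by a learned convolution to fuse the channels; this sketch only illustrates the feature-map merging itself.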
