Abstract

The ability to understand a scene and predict the poses of objects has attracted significant interest in recent years. Specifically, it is used with visual sensors to provide the information a robotic manipulator needs to interact with a target. Thus, 6D pose estimation and object recognition from point clouds or RGB-D images are important tasks for visual servoing. In this article, we propose a learning-based approach to 6D pose estimation for robotic manipulation using a BiLuNet-ICP pipeline. It consists of a multi-path convolutional neural network (CNN) for semantic segmentation on RGB images; the network extracts the object mask, which is merged with the depth information to perform 6D pose estimation via the Iterative Closest Point (ICP) algorithm. We collected our own dataset for training and evaluated the results with Intersection over Union (IoU). The proposed method provides better results than UNet++ when trained on a small amount of data. For the robotic grasping application, we test and evaluate our approach on a HIWIN 6-axis robot with an Asus Xtion Live 3D camera and our structured-light depth camera. The experimental results demonstrate its computational efficiency and high grasping success rate.
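The abstract's final stage, aligning the segmented object's point cloud to a reference model with ICP, can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic point-to-point ICP in NumPy, with hypothetical names (`best_fit_transform`, `icp`) and a brute-force nearest-neighbour search, assuming the mask and depth have already been fused into an Nx3 point cloud.

```python
import numpy as np

def best_fit_transform(A, B):
    """Least-squares rigid transform (Kabsch) mapping points A onto B (both Nx3)."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)              # cross-covariance of centred clouds
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:               # correct an improper (reflection) solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cb - R @ ca
    return R, t

def icp(src, dst, iters=50, tol=1e-8):
    """Point-to-point ICP: alternate nearest-neighbour matching and rigid fitting.

    Returns (R, t) such that src @ R.T + t approximately aligns with dst.
    """
    cur = src.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    prev_err = np.inf
    for _ in range(iters):
        # Brute-force nearest neighbours (fine for small clouds; use a
        # k-d tree for real data).
        d = np.linalg.norm(cur[:, None, :] - dst[None, :, :], axis=2)
        idx = d.argmin(axis=1)
        R, t = best_fit_transform(cur, dst[idx])
        cur = cur @ R.T + t
        # Compose the incremental transform into the running total.
        R_total, t_total = R @ R_total, R @ t_total + t
        err = d[np.arange(len(cur)), idx].mean()
        if abs(prev_err - err) < tol:      # converged
            break
        prev_err = err
    return R_total, t_total
```

In a pipeline like the one described, `src` would be the model point cloud and `dst` the masked depth points; the recovered `(R, t)` is the object's 6D pose in the camera frame, which is then transformed into the robot frame for grasping.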
