Abstract
A robotic system that can autonomously recognize objects and grasp them in real scenes with heavy occlusion would be desirable. In this paper, we integrate object detection, pose estimation, and grasp planning on the Kinova Gen3 (KG3), a 7-degrees-of-freedom (DOF) robotic arm with a low-performance native camera sensor, to implement an autonomous real-time 6-dimensional (6D) robotic grasping system. To estimate the object's 6D pose, the pixel-wise voting network (PV-net) is applied in the grasping system. However, with only RGB image input, PV-net cannot distinguish a real object from a photograph of it. To meet the demands of a real industrial environment, a rapid analytical method on the point cloud is developed to determine whether the detected object is real. In addition, our system shows stable and robust performance across different installation positions in heavily cluttered scenes.
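The real-versus-photo check on the point cloud could, in principle, exploit the fact that a photograph of an object is planar while the real object has depth. The sketch below illustrates one such planarity test via the eigenvalues of the point-cloud covariance; the function name, threshold, and PCA-based criterion are our assumptions for illustration, not the paper's actual method.

```python
import numpy as np

def is_real_object(points, flatness_thresh=0.01):
    """Classify a detected object's point cloud as a real 3D object
    or a flat photograph by checking how planar the points are.

    points: (N, 3) array of 3D points inside the detection mask.
    flatness_thresh: hypothetical threshold; a photo lies on a plane,
    so its out-of-plane variance is near zero.
    """
    pts = points - points.mean(axis=0)          # center the cloud
    cov = np.cov(pts.T)                          # 3x3 covariance matrix
    eigvals = np.sort(np.linalg.eigvalsh(cov))   # ascending eigenvalues
    # Ratio of out-of-plane variance to largest in-plane variance:
    ratio = eigvals[0] / max(eigvals[2], 1e-12)
    return ratio > flatness_thresh               # True -> real 3D object
```

A flat cloud (e.g. a printed photo lying on a table) yields a near-zero smallest eigenvalue and is rejected, while a genuinely three-dimensional cloud passes the threshold.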
Highlights
By reprojecting the object model into the pose estimation space, we improve the performance of the pixel-wise voting network (PV-net) so that it operates reliably in real working scenes with lighting interference and low-resolution images
We illustrate the performance of PV-net, the reprojection judgment, and the entire system
PV-net performs robustly in most test conditions
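The reprojection step named in the first highlight amounts to mapping 3D model points into the image through the estimated pose and camera intrinsics. A minimal sketch follows; the function name and argument conventions are our assumptions, not the paper's code.

```python
import numpy as np

def reproject_model(model_pts, R, t, K):
    """Project 3D object-model points into the image plane using an
    estimated 6D pose (R, t) and camera intrinsic matrix K.

    model_pts: (N, 3) points of the object model (object frame).
    R: (3, 3) rotation; t: (3,) translation (to camera frame); K: (3, 3).
    Returns (N, 2) pixel coordinates.
    """
    cam_pts = model_pts @ R.T + t        # object frame -> camera frame
    proj = cam_pts @ K.T                 # apply intrinsics
    return proj[:, :2] / proj[:, 2:3]    # perspective divide
```

Comparing these reprojected pixels against the detected object mask gives a consistency measure for the estimated pose.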
Summary
We expect a robotic system that can autonomously search for the desired object, estimate its pose, grasp it, and move it to a target position. Such systems can meet human demand in areas such as packaging logistics, industrial production, and medical services. With the advantages of abundant information, strong robustness, and low cost, computer vision techniques are increasingly being applied in intelligent systems. A robotic grasping system comprises detection, planning, and control components. The planning and control components are well developed, while detection has remained a challenge in recent years [1]. A major difficulty in detection arises because vision-based strategies are sensitive to the actual working scene. It remains a challenge to grasp a texture-less, irregular object placed in a heavily occluded scene.