Abstract

Autonomous robotic grasping is an essential skill for service robots performing specified tasks in unstructured scenarios. Previous work has focused on simple pick-and-place tasks, which is insufficient for real-world scenes that require object-specific manipulation. In this paper, we present a modular intelligent robot architecture based on a multi-task convolutional neural network for grasping and manipulating specific objects in stacked and cluttered environments. First, we propose an end-to-end, multi-task semantic grasping convolutional neural network (MSG-ConvNet) that simultaneously outputs grasp detection and semantic segmentation results, so that the affiliations between objects and grasps can be recognized in cluttered scenarios. Second, we propose a post-processing method that allows the robot to select an optimal grasping area in an active-perception manner by reasoning over the multi-modal information output by the proposed model. Compared with the benchmark, the proposed multi-task network substantially improves both recognition accuracy and detection speed on the public multi-object dataset GraspNet-1Billion. The proposed grasp detection method also yields state-of-the-art performance, with accuracies of 95.06% and 98.6% on the public single-object Jacquard Dataset and Cornell Dataset, respectively. In addition, experiments in a real-world scene demonstrate that our proposed method is more robust and adaptable than a simple direct grasping strategy in environments with heavy mutual occlusion.
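To make the multi-task idea concrete, the sketch below shows a generic network with a shared encoder feeding two parallel heads: one producing per-pixel grasp maps (quality, angle, width) and one producing semantic segmentation logits. This is only an illustration of the general multi-task pattern described in the abstract; the layer sizes, head design, and names (e.g. `MultiTaskGraspNet`, `grasp_head`, `seg_head`) are assumptions and do not reproduce the authors' MSG-ConvNet architecture.

```python
# Minimal multi-task sketch (assumed architecture, not the paper's MSG-ConvNet):
# a shared backbone with two parallel heads for grasp maps and segmentation.
import torch
import torch.nn as nn


class MultiTaskGraspNet(nn.Module):
    def __init__(self, num_classes: int = 20):
        super().__init__()
        # Shared feature extractor (placeholder for the paper's backbone).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Grasp branch: per-pixel quality, angle (sin/cos), and gripper width.
        self.grasp_head = nn.Conv2d(64, 4, 1)
        # Segmentation branch: per-pixel object-class logits.
        self.seg_head = nn.Conv2d(64, num_classes, 1)

    def forward(self, x):
        feats = self.backbone(x)
        return self.grasp_head(feats), self.seg_head(feats)


if __name__ == "__main__":
    model = MultiTaskGraspNet()
    grasp_maps, seg_logits = model(torch.randn(1, 3, 224, 224))
    # Both heads share the same spatial grid, so grasp candidates can be
    # associated with object masks pixel-by-pixel in post-processing.
    print(grasp_maps.shape, seg_logits.shape)
```

Because both outputs are aligned on the same feature grid, a simple post-processing step can restrict grasp selection to pixels belonging to the target object's segmentation mask, which is the kind of grasp-object affiliation reasoning the abstract refers to.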
