Abstract

This paper presents the multi-task Deconvolutional Single Shot Detector (MT-DSSD), which performs three tasks—object detection, semantic object segmentation, and grasping detection for a suction cup—in a single DSSD-based network. Performing object detection and segmentation simultaneously via multi-task learning improves the accuracy of both tasks. In addition, the model detects grasping points, covering the three recognition tasks required for robot manipulation. The proposed model performs fast inference, which reduces the time required for grasping operations. Evaluations on the Amazon Robotics Challenge (ARC) dataset show that our model achieves better object detection and segmentation performance than comparable methods, and robotic grasping experiments demonstrate that our model can detect appropriate grasping points.
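For readers unfamiliar with multi-task heads, the sketch below illustrates the general idea the abstract describes: one shared feature extractor feeding separate detection, segmentation, and grasping branches. The module names (SharedBackbone, MTHeads), layer sizes, anchor/class counts, and single-channel grasping map are illustrative assumptions, not the authors' exact MT-DSSD architecture.

```python
# Minimal sketch of a shared backbone with three task heads (PyTorch).
# All shapes and names here are assumptions for illustration only.
import torch
import torch.nn as nn


class SharedBackbone(nn.Module):
    """Stand-in for the shared DSSD-style feature extractor."""
    def __init__(self, out_channels: int = 256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.features(x)


class MTHeads(nn.Module):
    """Three task-specific heads operating on the shared feature map."""
    def __init__(self, in_channels: int = 256, num_classes: int = 40,
                 num_anchors: int = 6):
        super().__init__()
        # Detection head: per-anchor class scores and box offsets (SSD-style).
        self.det_cls = nn.Conv2d(in_channels, num_anchors * num_classes, 3, padding=1)
        self.det_box = nn.Conv2d(in_channels, num_anchors * 4, 3, padding=1)
        # Segmentation head: per-pixel class scores.
        self.seg = nn.Conv2d(in_channels, num_classes, 3, padding=1)
        # Grasping head: per-pixel suction suitability map (1 channel, assumed).
        self.grasp = nn.Conv2d(in_channels, 1, 3, padding=1)

    def forward(self, feat: torch.Tensor):
        return {
            "det_cls": self.det_cls(feat),
            "det_box": self.det_box(feat),
            "seg": self.seg(feat),
            "grasp": self.grasp(feat),
        }


if __name__ == "__main__":
    backbone, heads = SharedBackbone(), MTHeads()
    outputs = heads(backbone(torch.randn(1, 3, 320, 320)))
    print({k: tuple(v.shape) for k, v in outputs.items()})
```

Sharing the backbone in this way is what enables the joint training the abstract credits for the accuracy gains, while the single forward pass keeps inference fast.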
