Abstract
In industrial robot grasping tasks, multiple target objects are often placed in disorder, partially occluded, or stacked, which makes visual detection difficult in terms of both accuracy and real-time performance. The traditional Mask-RCNN algorithm achieves high detection accuracy when target objects are neatly arranged, but in complex scenarios such as disordered placement or partial occlusion there is still room for improvement in both accuracy and speed. Mask-RCNN introduces a mask head to predict pixel-level segmentation masks; this yields high detection accuracy but increases the computational cost, which limits detection speed. To address these problems, this paper proposes an improved Mask-RCNN with an indirect frame-subtraction loss function, which uses adjacent frames as comparison templates to locate differences between images: after one recognition pass, the previous recognition result is treated as the background and subsequent changes are treated as the target. This reduces repeated estimation of unchanged regions and improves both detection accuracy and running speed. Experiments on a self-made dataset show that the new method improves image recognition accuracy by about 2.3%, reduces recognition time by 9%, and raises the frame rate by 6 FPS, indicating a clear speed improvement. The research provides an important reference for realizing flexible robot grasping in intelligent manufacturing environments.
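The core idea of treating the previous recognition result as background and detecting only what changed can be illustrated with a minimal frame-subtraction sketch. This is an assumption-laden illustration, not the paper's implementation: the function name `change_mask` and the fixed intensity threshold are hypothetical, and the paper integrates the idea into the Mask-RCNN loss rather than as a standalone preprocessing step.

```python
import numpy as np

def change_mask(prev_frame: np.ndarray, curr_frame: np.ndarray,
                thresh: float = 25.0) -> np.ndarray:
    """Return a boolean mask of pixels that changed between frames.

    prev_frame plays the role of the background (the previous
    recognition result); pixels whose intensity changed by more than
    `thresh` are candidate target regions, so later detection can
    skip unchanged areas instead of re-estimating them.
    """
    # Cast to float to avoid unsigned-integer wraparound in the subtraction.
    diff = np.abs(curr_frame.astype(np.float32) - prev_frame.astype(np.float32))
    return diff > thresh

# Toy example: a new object appears in the top-left 2x2 corner.
prev = np.zeros((4, 4), dtype=np.uint8)
curr = prev.copy()
curr[0:2, 0:2] = 200
mask = change_mask(prev, curr)
print(int(mask.sum()))  # 4 changed pixels
```

Restricting subsequent region proposals to such a change mask is what lets the method avoid repeated computation over static background, which is the source of the reported speed gain.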