Abstract
Instance segmentation based on deep learning is one of the popular research directions in artificial intelligence, and it is widely used in fields such as intelligent manufacturing and autonomous driving. At present, most instance segmentation methods are built on object detection. The drawback of this scheme is that the pixels used to predict the mask do not capture all the information about the target object, which lowers segmentation accuracy. Segmentation accuracy in turn affects the reliability of applications that depend on computer vision. To address these problems, this paper proposes a global feature fusion (GFF) module that improves mask prediction accuracy in instance segmentation. The GFF module extracts features directly from the full-size feature map, so it can supply target information that lies outside the receptive field of a given feature-map pixel. Because the actual receptive field is smaller than the theoretical receptive field, some information is lost in the high-level feature maps. The feature map produced by the GFF module is added to mask prediction as supplementary information, which compensates for the gap between the theoretical and actual receptive fields. The GFF module also has only one hyperparameter, which reduces the difficulty of tuning the network's hyperparameters. The module is added to YOLACT to improve its mask prediction accuracy. Although the structure of YOLACT becomes more complex with the GFF module, the total number of network parameters changes little. With ResNet-50 as the backbone, YOLACT-GFF runs inference at 20.79 FPS and reaches an mAP of 28.3 on an RTX 2080 Ti GPU.
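As a rough illustration of the theoretical receptive field mentioned above, the following minimal sketch applies the standard recurrence for stacked convolution and pooling layers. The function name and the example layer stack are hypothetical and are not taken from YOLACT or the GFF module. The sketch only computes the theoretical upper bound, and the actual receptive field discussed in the abstract is smaller than this value.

```python
def theoretical_receptive_field(layers):
    """Return the theoretical receptive field, in input pixels, of one output unit.

    `layers` is a list of (kernel_size, stride) tuples ordered from input to
    output. Dilation is ignored in this sketch.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # spacing, in input pixels, between adjacent units of the current layer
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump  # each layer widens the field by (k - 1) * jump
        jump *= stride                  # strides compound the spacing for later layers
    return rf


# Hypothetical stack (an assumption, not the YOLACT backbone):
# 7x7 conv with stride 2, 3x3 pooling with stride 2, then two 3x3 convs with stride 1.
example_layers = [(7, 2), (3, 2), (3, 1), (3, 1)]
print(theoretical_receptive_field(example_layers))
```

Under these assumptions, a unit in a high-level feature map can in theory see only a bounded window of the input, which is why features taken from the whole feature map can carry information that a single unit misses.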