Abstract

This paper proposes a semantic segmentation network that adaptively segments objects of different sizes. ResNetV2-50 is first used to extract object features, which are then fed into a reconstructed feature pyramid network (FPN) that includes a multi-scale preference (MSP) module and a multi-location preference (MLP) module. Because objects differ in size, the receptive fields of the kernels need to be adjusted. The MSP module concatenates feature maps with different receptive fields and then combines them with the SE block from SE-Net to obtain scale-wise dependencies. In this way, multi-scale information is encoded adaptively in the feature maps with different degrees of preference, and multi-scale spatial information is also provided to the MLP module. The MLP module preferentially combines the channels that contain more accurate spatial location information, replacing the traditional nearest-neighbor interpolation upsampling in FPN. Finally, the weighted channels carry both scale-wise information and more accurate spatial location information, and they yield precise semantic predictions for objects of different sizes. We demonstrate the effectiveness of the proposed solutions on the Cityscapes and PASCAL VOC 2012 semantic image segmentation datasets, where our methods achieve comparable or higher performance.
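
The scale-wise weighting described for the MSP module can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows the general squeeze-and-excitation pattern (global average pooling, a two-layer gate, channel-wise rescaling) applied to feature maps concatenated from several branches. The branch count, map sizes, reduction ratio, random weights, and all function names are assumptions made for illustration.

```python
import math
import random


def global_avg_pool(fmap):
    """Squeeze: reduce a [C][H][W] feature map to one mean value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def dense(vec, weights):
    """Bias-free fully connected layer; weights has shape [out][in]."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def se_reweight(fmap, w1, w2):
    """SE-style gate: squeeze, FC + ReLU, FC + sigmoid, then scale each channel."""
    squeezed = global_avg_pool(fmap)
    hidden = [max(0.0, h) for h in dense(squeezed, w1)]
    gates = [1.0 / (1.0 + math.exp(-g)) for g in dense(hidden, w2)]
    scaled = [[[x * g for x in row] for row in ch] for ch, g in zip(fmap, gates)]
    return scaled, gates


def concat_branches(branches):
    """Concatenate several [C_i][H][W] maps along the channel axis."""
    return [ch for branch in branches for ch in branch]


rng = random.Random(0)


def random_map(channels, height, width):
    """Hypothetical stand-in for the output of one receptive-field branch."""
    return [[[rng.uniform(-1.0, 1.0) for _ in range(width)]
             for _ in range(height)] for _ in range(channels)]


# Three assumed branches (e.g. different receptive fields), 2 channels each.
branches = [random_map(2, 4, 4) for _ in range(3)]
stacked = concat_branches(branches)

num_channels = len(stacked)
reduction = 2  # assumed reduction ratio for the bottleneck layer
w1 = [[rng.uniform(-1.0, 1.0) for _ in range(num_channels)]
      for _ in range(num_channels // reduction)]
w2 = [[rng.uniform(-1.0, 1.0) for _ in range(num_channels // reduction)]
      for _ in range(num_channels)]

weighted, gates = se_reweight(stacked, w1, w2)
```

Each entry of `gates` lies between 0 and 1 and acts as the preference assigned to one channel, so channels coming from a branch whose receptive field suits the object can be emphasized over the others. In a trained network the gate weights would be learned rather than random.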
