Abstract

Fine-scale prediction tasks such as semantic segmentation require accurate labeling of object details, but existing segmentation models struggle to assign the deep semantic features generated by the encoder to shallow features, which leaves the segmentation of details ambiguous. In addition, high-precision models often require large amounts of computational resources. To address these problems, we design an efficient cross-information fusion decoder (ECFD). Within the ECFD, we design a cross-information fusion block (CFB) that uses contextual information to assign semantic information to the shallow features in the spatial domain, thus facilitating the classification of the details of segmented objects. To reduce the computational cost of the model, we adopt the same decoder structure as the efficient SenFormer: the feature pyramid structure. Compared with SenFormer, ECFD-Swin-Large reduces the number of parameters and floating-point operations by one third and achieves mIoU values of 83.61% and 64.98% on the benchmark datasets Cityscapes and Pascal Context, respectively, outperforming SenFormer, especially in detailed segmentation. In addition, it obtains 69.19% on BDD100K. The code is publicly available at https://github.com/songyang-xiaobai/ECFD-main.
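The results above are reported as mean intersection-over-union (mIoU). As a reminder of what that metric measures, the following is a minimal sketch of the standard definition in plain Python. It is not taken from the ECFD repository: the function name `mean_iou`, the flattened-label input format, and the `ignore_index` value of 255 are assumptions made for illustration, and the exact evaluation protocol used by the authors is not stated in the abstract.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed "void" label; not specified in the abstract
) -> float:
    """Mean IoU over flattened per-pixel class labels.

    Predictions are assumed to lie in [0, num_classes). Pixels whose
    ground-truth label equals ignore_index are skipped entirely.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        # |A union B| = |A| + |B| - |A intersect B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # classes absent from both pred and target are skipped
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Class 0: IoU = 1/2, class 1: IoU = 2/3, so the mean is 7/12.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Benchmark scores are typically computed by accumulating these counts over the whole validation set before dividing, which this function supports if the labels of all images are concatenated into one sequence.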
