Abstract

Convolutional neural networks have attracted much attention for the semantic segmentation of remote sensing imagery. Segmentation quality on such images depends heavily on how well contextual information is extracted. A traditional convolutional neural network is constrained by the size of its convolution kernel and therefore concentrates mainly on local context. We propose a new lightweight global context semantic segmentation network, LightFGCNet, that makes fuller use of global context information while reducing the number of model parameters. It uses an encoder–decoder architecture and, during the upsampling stage of decoding, gradually combines feature information from adjacent encoder blocks, allowing the network to better extract global context information. Because frequent merging of feature information introduces a significant amount of redundant noise, we build a lightweight parallel channel spatial attention module (PCSAM) that emphasizes the few critical contextual features. Additionally, we design a multi-scale fusion module (MSFM) to capture target feature information at multiple scales. We conduct comprehensive experiments on two well-known datasets, ISPRS Vaihingen and WHU Building. The results demonstrate that the proposed approach effectively reduces the number of parameters: the model has 3.12 M parameters and 23.5 G FLOPs. It achieves an mIoU of 70.45% on ISPRS Vaihingen and an IoU of 89.87% on WHU Building, which is significantly better than what conventional convolutional neural networks for semantic segmentation deliver.
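
The results above are reported as IoU and mIoU. As a rough illustration of how these metrics are commonly computed from a pixel-level confusion matrix, the following minimal sketch may help. The function names (`confusion_matrix`, `per_class_iou`, `mean_iou`) and the toy labels are assumptions made for this example and are not taken from the paper, whose exact evaluation protocol (for instance, which classes are ignored) is not described here.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_iou(matrix):
    """IoU per class: TP / (TP + FP + FN); None if the class never occurs."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # true class c, predicted otherwise
        fp = sum(matrix[r][c] for r in range(n)) - tp  # predicted c, true class differs
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


def mean_iou(matrix):
    """mIoU: average of per-class IoU over classes that occur at least once."""
    valid = [v for v in per_class_iou(matrix) if v is not None]
    if not valid:
        raise ValueError("no class occurs in labels or predictions")
    return sum(valid) / len(valid)


# Toy example with flattened label maps (hypothetical data, three classes).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(per_class_iou(cm))
print(mean_iou(cm))
```

For a binary task such as building extraction, the single IoU figure usually refers to the foreground (building) class, that is, one entry of `per_class_iou` rather than the mean; this is an assumption about common practice, not a statement about the paper's setup.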
