Abstract
Semantic segmentation of remote sensing images has become the main approach for extracting cropland information. However, in the mountainous regions of southwestern China, croplands are narrow and fragmented and follow complex planting patterns, making it difficult for traditional semantic segmentation methods to accurately delineate fine-grained cropland boundaries. To address these challenges, this paper proposes a multiattention Transformer network, MATNet, for fine-grained, parcel-level extraction of cropland in complex scenes. MATNet is built on the fusion of a CNN encoder and a Transformer decoder. In the encoder, spatial and channel reconstruction units are introduced to reduce information redundancy in the convolutional layers. The Transformer decoder incorporates multiple attention mechanisms; this design enhances the attention window's perception of local content and, through optimized computational allocation, improves the model's ability to extract features from fine-grained cropland parcels. On the Dali cropland dataset, for example, MATNet achieved the highest values across five evaluation metrics, including mIoU. Specifically, the Recall, F1, and mIoU scores were 94.68%, 94.69%, and 89.92%, respectively. Compared with six other advanced models, MATNet consistently performed best at extracting fine-grained cropland parcels.
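For readers unfamiliar with the reported metrics, the following is a minimal sketch of how Recall, F1, and mIoU are commonly computed from a pixel-level confusion matrix. It is not the authors' evaluation code. It assumes a two-class setup (0 = background, 1 = cropland) and macro-averaging over classes, neither of which the abstract specifies; the function names `confusion_matrix` and `segmentation_metrics` and the toy labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes=2):
    """Count pixels: cm[t][p] = pixels of true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Macro-averaged recall, F1, and mIoU (averaging scheme is an assumption)."""
    n = len(cm)
    recalls, f1s, ious = [], [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class otherwise
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
        recalls.append(recall)
        f1s.append(f1)
        ious.append(iou)
    return {
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
        "miou": sum(ious) / n,
    }


# Toy flattened label maps (hypothetical data, 8 pixels).
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

metrics = segmentation_metrics(confusion_matrix(y_true, y_pred))
print(metrics)  # {'recall': 0.75, 'f1': 0.75, 'miou': 0.6}
```

For a single class, IoU and F1 are related by IoU = F1 / (2 - F1), which is why mIoU is always the lower of the two figures for the same predictions.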