Abstract
Remote sensing Change Detection (CD) identifies changed regions of interest in bi-temporal remote sensing images. CD has developed rapidly in recent years thanks to the strong learning ability of Convolutional Neural Networks (CNNs), which enables complex feature extraction. However, the local receptive fields of CNNs limit their ability to model long-range contextual relationships in semantic changes. This work therefore explores the potential of Siamese Transformers for CD tasks and proposes a general CD model, STCD, built on Swin Transformers. In the encoder, pure Transformers without any CNN model the long-range context of semantic tokens, reducing computational overhead and improving model efficiency compared with current methods. In the decoder, a 3D convolution block captures the change features across the time series, and a deconvolution layer with axial attention generates the predicted change map. Extensive experiments on three binary CD datasets and one semantic CD dataset demonstrate that STCD outperforms several popular benchmark methods in terms of both performance and number of parameters. Among the STCD variants, Base-STCD reached F1-Scores of 89.85%, 54.72%, and 93.75% on the binary CD datasets LEVIR, DSIFN, and SVCD, respectively, and an mF1-Score of 75.60% and an mIoU of 66.19% on the semantic CD dataset SECOND.
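For readers unfamiliar with the reported metrics, the following is a minimal sketch of how an F1-Score and an IoU can be computed for the "changed" class of a binary change map. It is illustrative only and is not taken from the STCD code: the function name `binary_change_scores` and the toy masks are assumptions, and the exact evaluation protocol behind the reported numbers (including how mF1-Score and mIoU are averaged on SECOND) is not stated in the abstract.

```python
def binary_change_scores(pred, target):
    """Return (precision, recall, f1, iou) for the 'changed' class.

    pred and target are 2D lists of 0/1 values with the same shape,
    where 1 marks a changed pixel. Hypothetical helper for illustration.
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == 1 and t == 1:
                tp += 1  # changed pixel correctly detected
            elif p == 1 and t == 0:
                fp += 1  # false alarm
            elif p == 0 and t == 1:
                fn += 1  # missed change

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return precision, recall, f1, iou


# Toy 2x3 masks (made-up values, not from any dataset).
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
precision, recall, f1, iou = binary_change_scores(pred, target)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} iou={iou:.3f}")
```

A mean variant such as mF1-Score or mIoU would average per-class values of these quantities; which classes enter that average depends on the benchmark's protocol.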