Abstract

Semantic segmentation of remote sensing images is widely used in environmental protection, geological disaster detection, and natural resource assessment. With the rapid development of deep learning, convolutional neural networks (CNNs) have come to dominate semantic segmentation thanks to their strong local feature extraction. Because the convolution operation is local, however, it is difficult for a CNN to capture global context directly. The Transformer, by contrast, is well suited to modeling global information. This paper proposes CTFuse, a new hybrid convolutional and Transformer semantic segmentation model that uses a multi-scale convolutional attention module in its convolutional part. CTFuse is a serial structure composed of a CNN followed by a Transformer: it first uses convolution to extract information about small targets and then uses the Transformer to embed information about large ground targets. We further propose a spatial and channel attention module in the convolutional part to enhance the representation of both global information and local features, and a spatial and channel attention module in the Transformer part to improve its ability to capture detailed information. In our experiments, CTFuse achieves state-of-the-art results among the compared models on the International Society for Photogrammetry and Remote Sensing (ISPRS) Vaihingen and ISPRS Potsdam datasets.
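The contrast between local convolution and global attention, and the serial ordering described above, can be illustrated with a toy sketch. This is not the CTFuse implementation: it assumes a 1-D signal instead of an image, a fixed 3-tap kernel, and a single-head attention with identity projections, and the names `conv1d_same`, `self_attention`, and `serial_hybrid` are hypothetical. It only shows that a convolution output depends on a small neighborhood, while an attention output depends on every position.

```python
import math


def conv1d_same(x, kernel):
    """Zero-padded 1-D convolution (cross-correlation form); output has the same length as x."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - radius
            if 0 <= idx < len(x):  # positions outside the signal count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def self_attention(x):
    """Toy single-head self-attention on scalars (assumption: identity Q/K/V projections)."""
    out = []
    for q in x:
        scores = [q * k for k in x]
        top = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, x)) / total)
    return out


def serial_hybrid(x, kernel):
    """Serial structure: a local convolution stage followed by a global attention stage."""
    return self_attention(conv1d_same(x, kernel))


signal = [0.1, 0.4, 0.2, 0.9, 0.3, 0.7, 0.5, 0.8]
perturbed = signal[:-1] + [signal[-1] + 1.0]  # change only the last position
kernel = [0.25, 0.5, 0.25]

conv_a = conv1d_same(signal, kernel)
conv_b = conv1d_same(perturbed, kernel)
hyb_a = serial_hybrid(signal, kernel)
hyb_b = serial_hybrid(perturbed, kernel)

# Convolution: the first output is unaffected by a change at the far end.
print(conv_a[0] == conv_b[0])
# Serial hybrid: the first output changes, because attention mixes all positions.
print(hyb_a[0] != hyb_b[0])
```

In this toy setting the convolution stage alone cannot propagate the distant change to the first position, whereas the attention stage that follows it can, which is the motivation the abstract gives for combining the two in series.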
