Abstract

Object detection and instance segmentation in remote sensing images are widely studied research fields. Convolutional neural networks (CNNs) have shown shortcomings in object detection for remote sensing images. In recent years, the number of studies on transformer-based models has increased, and these studies have achieved good results. However, transformers still suffer from poor small-object detection and unsatisfactory segmentation of edge details. To solve these problems, we improved the Swin transformer by combining the advantages of transformers and CNNs, designing a local perception Swin transformer (LPSW) backbone that enhances the local perception of the network and improves the detection accuracy of small-scale objects. We also designed a spatial attention interleaved execution cascade (SAIEC) network framework, which helps to improve the segmentation accuracy of the network. Because remote sensing mask datasets are scarce, we created the MRS-1800 remote sensing mask dataset. Finally, we combined the proposed backbone with the new network framework and conducted experiments on the MRS-1800 dataset. Compared with the Swin transformer, the proposed model improved the mask AP by 1.7%, mask APS by 3.6%, AP by 1.1% and APS by 4.6%, demonstrating its effectiveness and feasibility.

Highlights

  • With the continuous advancement of science and technology, remote sensing technology is developing rapidly

  • The extraction of relevant urban metrics is important for characterizing urban typologies, and image segmentation based on deep learning is optimal for the extraction of road features in marginal areas located in urban environments [5]

  • In order to alleviate this problem, we proposed the local perception block (LPB), which is inserted in front of the Swin transformer block


Summary

Introduction

With the continuous advancement of science and technology, remote sensing technology is developing rapidly. The feature information contained in remote sensing images has become more abundant, and a large amount of valuable information can be extracted from these images and used for scientific and technological research. Object detection and segmentation in remote sensing images have important research significance and value for the development of the aviation and remote sensing fields, and have broad application prospects in many practical scenarios, such as marine monitoring, ship management and control, and urban planning. The extraction of relevant urban metrics is important for characterizing urban typologies, and image segmentation based on deep learning is optimal for the extraction of road features in marginal areas located in urban environments [5].
