Abstract

Most U-shaped convolutional neural network (CNN) methods extract features insufficiently and fail to make full use of global and multi-scale context information, which makes it difficult to distinguish similar objects and shadow-occluded objects in remote sensing images. This article proposes a ConvNeXt multi-scale pyramid fusion U-shaped network (CMPF-UNet). We first propose a novel backbone network based on ConvNeXt to enhance image feature extraction, and use ConvNeXt bottleneck blocks to reconstruct the decoder. Furthermore, a scale-aware pyramid fusion (SAPF) module and a residual atrous spatial pyramid pooling (RASPP) module are proposed to dynamically fuse the rich multi-scale context information in high-level features. Finally, multiple global pyramid guidance (GPG) modules are embedded in the network to provide the decoder with different levels of global context information by reconstructing the skip connections. Experiments on the Vaihingen and Potsdam datasets indicate that the proposed CMPF-UNet achieves more accurate segmentation results.
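
As a rough illustration of the idea behind a residual atrous pyramid, the sketch below applies one kernel at several dilation rates to a 1-D signal, averages the branches, and adds the input back as a residual. It is not the RASPP module of the paper, which operates on 2-D feature maps with learned weights. The function names `dilated_conv1d` and `residual_aspp_1d`, the fixed kernel, the dilation rates, and the averaging fusion are all assumptions made for this example.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """Zero-padded, same-length 1-D atrous (dilated) cross-correlation.

    Kernel taps are spaced `rate` samples apart, so a larger rate widens the
    receptive field without adding weights.
    """
    n = len(signal)
    center = len(kernel) // 2
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * rate
            if 0 <= idx < n:  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def residual_aspp_1d(signal: Sequence[float], kernel: Sequence[float],
                     rates: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Hypothetical multi-rate fusion: average the dilated branches, add the input."""
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    # Assumed fusion rule: plain average across branches (the paper's rule may differ).
    fused = [sum(vals) / len(rates) for vals in zip(*branches)]
    # Residual connection: the input is added back to the fused context.
    return [x + f for x, f in zip(signal, fused)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    print(residual_aspp_1d(impulse, [1, 1, 1], rates=(1, 2)))
```

With an impulse input, each branch spreads the response over a neighbourhood whose width grows with the dilation rate, which is the sense in which such a pyramid gathers multi-scale context.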
