CMPF-UNet: a ConvNeXt multi-scale pyramid fusion U-shaped network for multi-category segmentation of remote sensing images

Ning Li, Xiaopeng Yu, Miao Yu

doi:10.1080/10106049.2024.2311217

Abstract

Most U-shaped convolutional neural network (CNN) methods suffer from insufficient feature extraction and fail to fully exploit global and multi-scale context information, which makes it difficult to distinguish similar objects and shadow-occluded objects in remote sensing images. This article proposes a ConvNeXt multi-scale pyramid fusion U-shaped network (CMPF-UNet). We first propose a novel backbone network based on ConvNeXt to enhance image feature extraction, and use ConvNeXt bottleneck blocks to reconstruct the decoder. Furthermore, a Scale-Aware Pyramid Fusion (SAPF) module and a Residual Atrous Spatial Pyramid Pooling (RASPP) module are proposed to dynamically fuse the rich multi-scale context information in high-level features. Finally, multiple Global Pyramid Guidance (GPG) modules are embedded in the network to provide different levels of global context information for the decoder by reconstructing the skip connections. Experiments on the Vaihingen and Potsdam datasets indicate that the proposed CMPF-UNet achieves more accurate segmentation results.
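
The abstract does not spell out the internals of the RASPP module, so the following is only a minimal sketch of the general idea its name suggests: parallel atrous (dilated) convolutions at several rates, fused and added back to the input through a residual connection. It is written in plain Python on a 1D signal for readability. The function names, the dilation rates (1, 2, 4), and the averaging fusion are all assumptions made for illustration, not the authors' design.

```python
def atrous_conv1d(signal, kernel, rate):
    """'Same'-size 1D atrous (dilated) convolution with zero padding."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # kernel taps are spaced `rate` samples apart
            if 0 <= j < len(signal):   # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def residual_atrous_pyramid(signal, kernel, rates=(1, 2, 4)):
    """Hypothetical fusion: average the parallel branches, then add the input back."""
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    fused = [sum(vals) / len(rates) for vals in zip(*branches)]
    return [x + f for x, f in zip(signal, fused)]


def receptive_field(kernel_size, rate):
    """Number of input samples spanned by one atrous kernel."""
    return (kernel_size - 1) * rate + 1
```

A larger rate widens the span covered by the same three weights (`receptive_field(3, 4)` is 9, against 3 for an ordinary convolution), which is how a pyramid of rates collects context at several scales without adding parameters.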

Full Text