Abstract
Semantic segmentation of high-resolution remote sensing images (HRRSIs) remains difficult where complex architectural structures and shadow occlusions are present, which limits the effectiveness of existing deep learning approaches. To address these limitations, we propose an attention-focused feature enhancement network (AFENet) with an encoder–decoder architecture. The encoder combines ResNet50 with a parallel multistage feature enhancement group (PMFEG), which strengthens feature extraction through channel reduction, scale expansion, and channel reassignment operations. In the decoder, a global multi-scale attention mechanism (GMAM) synthesizes spatial information across multiple scales by learning global–local relationships. A feature-weighted fusion module (FWFM) then integrates remote spatial features with local semantic information to improve segmentation accuracy. Experiments across diverse scenarios show that AFENet outperforms state-of-the-art methods in building structure detection, with better segmentation connectivity and completeness.
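The abstract does not say how the FWFM computes or applies its weights. As a rough illustration of the general idea of weighted feature fusion only, the sketch below upsamples a coarse semantic map to the size of a finer spatial map and blends the two with non-negative, normalized scalar weights. Everything in it is an assumption made for illustration, not AFENet's actual module: the function names `upsample_nearest` and `weighted_fusion`, the single-channel maps, the scalar weights, and the nearest-neighbour upsampling.

```python
from typing import List

Grid = List[List[float]]  # one single-channel feature map, rows x cols


def upsample_nearest(feat: Grid, factor: int) -> Grid:
    """Nearest-neighbour upsampling by an integer factor (hypothetical helper)."""
    out: Grid = []
    for row in feat:
        # Repeat every value `factor` times along the width...
        wide = [v for v in row for _ in range(factor)]
        # ...then repeat the widened row `factor` times along the height.
        out.extend([list(wide) for _ in range(factor)])
    return out


def weighted_fusion(spatial: Grid, semantic: Grid,
                    w_spatial: float, w_semantic: float,
                    eps: float = 1e-8) -> Grid:
    """Blend two same-sized maps with non-negative, normalized weights.

    Assumption: in a trained network the two weights would be learned;
    here they are plain floats supplied by the caller.
    """
    a = max(w_spatial, 0.0)   # clamp so the weights stay non-negative
    b = max(w_semantic, 0.0)
    total = a + b + eps       # eps avoids division by zero
    a, b = a / total, b / total
    return [[a * s + b * m for s, m in zip(row_s, row_m)]
            for row_s, row_m in zip(spatial, semantic)]


# Toy data: a 4x4 fine "spatial" map and a 2x2 coarse "semantic" map.
spatial = [[1.0, 2.0, 3.0, 4.0] for _ in range(4)]
semantic_lowres = [[0.0, 8.0], [8.0, 0.0]]

semantic = upsample_nearest(semantic_lowres, 2)       # now 4x4
fused = weighted_fusion(spatial, semantic, 1.0, 1.0)  # equal weights
```

With equal weights each fused value is approximately the mean of the two inputs at that position. Raising one weight relative to the other shifts the result toward that branch.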