Abstract

This paper proposes a novel convolutional neural network (CNN) architecture designed for semantic segmentation of remote sensing images. The proposed W13 Net model addresses the inherent challenges of segmentation tasks through a carefully crafted architecture that combines multistage encoding–decoding, skip connections, combined weighted output, and concatenation techniques. The proposed model outperforms existing segmentation approaches: a comprehensive analysis of different segmentation models has been carried out, resulting in an extensive comparison between the proposed W13 Net and five existing state-of-the-art segmentation architectures. Using two standardized datasets, the Dense Labeling Remote Sensing Dataset (DLRSD) and the Mohammad Bin Rashid Space Center (MBRSC) Dubai Aerial Imagery Dataset, the evaluation covers training, testing, and validation across different classes. The W13 Net demonstrates adaptability, strong generalization, and superior results on key metrics, while remaining robust across a variety of datasets. Performance was evaluated using accuracy, precision, recall, F1 score, and IoU. According to the experimental results, the W13 Net model achieved an accuracy of 87.8%, a precision of 0.88, a recall of 0.88, an F1 score of 0.88, and an IoU of 0.74. The proposed model improved segmentation IoU by up to 18% compared with recent segmentation models, which is notable given its comparatively low parameter count (2.2 million).
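For reference, intersection over union (IoU), the metric on which the reported 0.74 score and up-to-18% gain are measured, can be computed per class from a predicted and a ground-truth segmentation mask. The short NumPy sketch below is a generic illustration of that computation only; it is not taken from the paper, and the function name and example data are hypothetical.

```python
import numpy as np

def per_class_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> np.ndarray:
    """Per-class intersection over union for integer-labelled segmentation masks.

    pred, target: arrays of shape (H, W) with class indices in [0, num_classes).
    Returns an array of length num_classes; classes absent from both masks are NaN.
    """
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union > 0:
            intersection = np.logical_and(pred_c, target_c).sum()
            ious[c] = intersection / union
    return ious

# Hypothetical example: mean IoU over random 4x4 masks with 3 classes.
pred = np.random.randint(0, 3, size=(4, 4))
target = np.random.randint(0, 3, size=(4, 4))
print(np.nanmean(per_class_iou(pred, target, num_classes=3)))
```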