In the era of big data, automatically extracting surface water bodies (WBs) from remote sensing images in complex environments remains extremely challenging. Because existing algorithms are weak at extracting small WBs and WB edge information from remote sensing images, we propose a new method, Multiscale Fusion SegFormer (MF-SegFormer), for WB extraction in the Weihe River Basin of China using Landsat 8 OLI images. MF-SegFormer adopts a cascading approach to fuse the features output by the SegFormer encoder at multiple scales. A feature fusion (FF) module is proposed to enhance the extraction of WB edge information, while an Atrous Spatial Pyramid Pooling (ASPP) module is employed to enhance the extraction of small WBs. Furthermore, we analyze the impact of four kinds of band combinations on WB extraction by the MF-SegFormer model, including true color composite images, false color images, true color images, and false color images enhanced by Gaussian stretch. We also compare the proposed method with several other approaches. The results suggest that false color composite images enhanced by Gaussian stretch are beneficial for extracting WBs, and that the MF-SegFormer model achieves the highest accuracy across the study area, with a precision of 77.6%, recall of 84.4%, F1-score of 80.9%, and mean intersection over union (mIoU) of 83.9%. In addition, we use the coefficient of determination (R²) and root-mean-square error (RMSE) to evaluate the performance of river width extraction. Our extraction yields an overall R² of 0.946 and an RMSE of 28.21 m for the mainstream width in the “Xi’an-Xianyang” section of the Weihe River. The proposed MF-SegFormer method outperforms the compared methods and is more robust for WB extraction.
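The evaluation metrics named above can be illustrated with a minimal sketch in plain Python. This is not the authors' evaluation code: the function names are hypothetical, the masks are assumed to be flat sequences of 0/1 labels (1 = water), mIoU is assumed to be the mean of the water and background IoU, and R² is assumed to be the coefficient of determination 1 − SS_res/SS_tot computed against reference widths.

```python
import math


def segmentation_metrics(pred, truth):
    """Precision, recall, F1 and mIoU for flat 0/1 masks (1 = water).

    Assumption: mIoU is averaged over the two classes, water and background.
    """
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    iou_water = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    iou_background = tn / (tn + fp + fn) if (tn + fp + fn) else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
        "miou": (iou_water + iou_background) / 2,
    }


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def rmse(estimated, reference):
    """Root-mean-square error between estimated and reference river widths."""
    squared = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(estimated, reference):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - e) ** 2 for e, r in zip(estimated, reference))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1 - ss_res / ss_tot
```

As a consistency check, `f1_score(0.776, 0.844)` evaluates to roughly 0.809, which agrees with the reported F1-score of 80.9%.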