In recent years, rapid advances in deep learning have transformed remote sensing, with deep learning methods greatly outperforming traditional approaches and attracting substantial interest in the academic community. To address the difficulty of extracting complex water bodies in disaster scenarios, this study constructs a post-disaster floodwater body dataset and proposes an enhanced multi-scale transformer architecture. End-to-end training significantly improves the precision with which the model extracts floodwater contours. In addition, an unsupervised pre-training task exploits the large volume of unannotated remote sensing data to strengthen the model’s backbone network, further improving its performance in remote sensing applications. Experiments show that the proposed multi-scale transformer-based algorithm for floodwater contour extraction is widely applicable and delivers precise segmentation results in complex environments, including disaster-stricken areas. The approach thus offers a robust tool for disaster management and environmental monitoring.