ABSTRACT Multibeam water column images (WCIs) play a crucial role in the detection and recognition of underwater targets. However, a WCI is not a complete image of the underwater space because of the gaps between beams inherent in multibeam systems. In this study, we design a WCI super-resolution network (WCISRN) based on the Swin Transformer to handle the uneven and sparse distribution of water column data. To enable WCISRN to learn morphological features effectively, we produce simulated datasets for pre-training. We then conduct self-supervised training on a measured WCI dataset to fine-tune the network parameters, so that WCISRN better adapts to the backscattering intensity features of WCIs. Finally, through network cross-fusion, the pre-trained network and the self-supervised network are combined into a new network that balances image morphology and intensity features. The proposed method is evaluated through visual analysis and quantitative comparison on simulated and real-world datasets. The experimental results show that our method restores detail more effectively and produces clearer image morphology. The average peak signal-to-noise ratio (PSNR) on the test set improved from 23.75 to 26.14 dB, indicating a significant enhancement in the imaging quality of WCIs.
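
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how PSNR is conventionally computed between a reference image and a reconstructed one. It is not the authors' evaluation code: the function names `psnr` and `mean_psnr`, the nested-list image representation, and the default peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Return the PSNR in dB between two equally sized 2-D images.

    Images are nested lists of numbers (an assumption for this sketch);
    max_value is the assumed peak intensity of the image format.
    """
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of rows")

    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        if len(ref_row) != len(est_row):
            raise ValueError("rows must have the same length")
        for r, e in zip(ref_row, est_row):
            squared_error += (r - e) ** 2
            count += 1

    if count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / count  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def mean_psnr(pairs, max_value=255.0):
    """Average the per-image PSNR over (reference, estimate) pairs."""
    values = [psnr(ref, est, max_value) for ref, est in pairs]
    return sum(values) / len(values)
```

Because PSNR is logarithmic in the mean squared error, a higher value means the reconstruction is closer to the reference. The test-set figure would be obtained by averaging per-image values, as `mean_psnr` does here under the stated assumptions.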