Remote sensing has become one of the most important means of extracting sea–land boundaries because of its large coverage, high efficiency, and low cost. However, sea–land segmentation (SLS) remains challenging because of data diversity and inconsistency, the “different objects with the same spectrum” and “same object with different spectra” phenomena, and noise and interference. In this paper, we propose PMFormer, a new sea–land segmentation method for remote-sensing images. It makes two main contributions. First, building on the Mask2Former architecture, we introduce a prompt mask derived from the normalized difference water index (NDWI) of the target image, together with a prompt encoder. The prompt mask provides more reasonable constraints on attention, which alleviates segmentation errors along the boundaries of small regions and in small branches; such errors arise when large data diversity or inconsistency leaves the model with insufficient prior information. Second, to address the large intra-class differences in foreground–background segmentation of sea–land scenes, we use deep clustering to simplify the query vectors and make them better suited to binary segmentation. We then compare PMFormer thoroughly with the traditional NDWI method and eight other deep-learning methods on three open sea–land datasets. Detailed comparative experiments, covering quantitative analysis, qualitative analysis, time consumption, and error distribution, confirm the effectiveness of the proposed method.
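As a minimal illustration of how an NDWI-based mask of the kind mentioned above might be obtained, the sketch below computes NDWI per pixel with the standard definition, NDWI = (Green - NIR) / (Green + NIR), and thresholds it into a binary water mask. The function names, the zero threshold, the `eps` stabilizer, and the toy band values are assumptions made for illustration; the abstract does not specify how PMFormer builds or encodes its prompt mask.

```python
import numpy as np


def ndwi(green, nir, eps=1e-6):
    """Per-pixel NDWI = (Green - NIR) / (Green + NIR)."""
    green = np.asarray(green, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    # eps (an assumption) avoids division by zero where both bands are zero.
    return (green - nir) / (green + nir + eps)


def ndwi_prompt_mask(green, nir, threshold=0.0):
    """Hypothetical helper: 1 where NDWI exceeds the threshold, else 0.

    A threshold of 0 is a common default for separating water from land,
    but it is an assumption here, not a value taken from the paper.
    """
    return (ndwi(green, nir) > threshold).astype(np.uint8)


# Toy 2x2 reflectance values (illustrative only): the left column is
# water-like (green > NIR), the right column is land-like (NIR > green).
green = np.array([[0.30, 0.10],
                  [0.25, 0.05]])
nir = np.array([[0.05, 0.40],
                [0.10, 0.35]])

mask = ndwi_prompt_mask(green, nir)
print(mask)
```

In a prompt-based pipeline, a coarse mask like this would serve only as a prior; how it is encoded and injected into the attention layers is left to the method itself.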