Coastal aquaculture plays a crucial role in global food security and the economic development of coastal regions, but it also degrades coastal ecosystems. Automated, accurate extraction and monitoring of coastal aquaculture areas are therefore essential for the scientific management of coastal ecological zones. This study proposes a novel deep learning- and attention-based median adaptive fusion U-Net (MAFU-Net) for precisely extracting individually separable aquaculture ponds (ISAPs) from medium-resolution remote sensing imagery. First, the study analyzes the spectral differences between aquaculture ponds and interfering objects, such as saltwater fields, in four typical aquaculture areas along the coast of Liaoning Province, China. On this basis, it introduces a difference index for saltwater field aquaculture zones (DIAS) and adds the index to the remote sensing imagery as a new band to increase the expressiveness of the features. Second, it designs a median augmented adaptive fusion module (MEA-FM) that adaptively selects channel receptive fields at various scales, integrates information across channels, and captures multiscale spatial information to improve extraction accuracy. Experimental and comparative results show that MAFU-Net achieves an F1 score of 90.67% and an intersection over union (IoU) of 83.93% on the CHN-LN4-ISAPS-9 dataset, outperforming advanced methods such as U-Net, DeepLabV3+, SegNet, PSPNet, SKNet, UPS-Net, and SegFormer. The results provide accurate data support for the scientific management of aquaculture areas, and MAFU-Net offers an effective approach to semantic segmentation of medium-resolution remote sensing images.
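
For readers unfamiliar with the two reported metrics, the following minimal sketch computes the standard pixel-wise F1 score and IoU for the positive (pond) class of a pair of binary masks. It is an illustration only: the function name `f1_and_iou`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions, and the abstract does not state how the study aggregates these metrics over images or classes.

```python
def f1_and_iou(pred, truth):
    """Return (F1, IoU) for the positive class of two binary masks.

    `pred` and `truth` are equally shaped nested lists of 0/1 values
    (a hypothetical representation chosen to keep the sketch dependency-free).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # pond pixel correctly predicted
            elif p and not t:
                fp += 1          # background predicted as pond
            elif t and not p:
                fn += 1          # pond pixel missed

    f1_den = 2 * tp + fp + fn
    iou_den = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    f1 = 2 * tp / f1_den if f1_den else 1.0
    iou = tp / iou_den if iou_den else 1.0
    return f1, iou


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
f1, iou = f1_and_iou(pred, truth)
print(f"F1 = {f1:.4f}, IoU = {iou:.4f}")
```

Both metrics ignore true negatives, which is why they are commonly preferred over overall accuracy when the target class occupies a small fraction of the scene.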