Traditional dehazing approaches that rely on prior knowledge exhibit limited efficacy when confronted with the intricacies of real-world hazy environments. While learning-based dehazing techniques necessitate large-scale datasets for effective model training, the acquisition of these datasets is time-consuming and laborious, and the resulting models may encounter a domain shift when processing real-world hazy images. To overcome the limitations of prior-based and learning-based dehazing methods, we propose a self-supervised remote sensing (RS) image-dehazing network based on zero-shot learning, where the self-supervised process avoids dense dataset requirements and the learning-based structures refine the artifacts in extracted image priors caused by complex real-world environments. The proposed method has three stages. The first stage involves pre-processing the input hazy image by utilizing a prior-based dehazing module; in this study, we employed the widely recognized dark channel prior (DCP) to obtain atmospheric light, a transmission map, and the preliminary dehazed image. In the second stage, we devised two convolutional neural networks, known as RefineNets, dedicated to enhancing the transmission map and the initial dehazed image. In the final stage, we generated a hazy image using the atmospheric light, the refined transmission map, and the refined dehazed image by following the haze imaging model. The meticulously crafted loss function encourages cycle-consistency between the regenerated hazy image and the input hazy image, thereby facilitating a self-supervised dehazing model. During the inference phase, the model undergoes training in a zero-shot manner to yield the haze-free image. These thorough experiments validate the substantial improvement of our method over the prior-based dehazing module and the zero-shot training efficiency. Furthermore, assessments conducted on both uniform and non-uniform RS hazy images demonstrate the superiority of our proposed dehazing technique.