Time is a critical factor in maritime Search And Rescue (SAR) missions, in which promptly locating survivors is paramount. Unmanned Aerial Vehicles (UAVs) can increase the success rate of these missions by rapidly identifying targets. While this task can also be performed by other means, such as helicopters, the cost-effectiveness of UAVs makes them an attractive choice. Moreover, these vehicles allow the easy integration of automatic systems to assist in the search process. Despite the impact of artificial intelligence on autonomous technology, two major drawbacks remain: the need for sufficient training data to cover the wide variability of scenes that a UAV may encounter, and the strong dependence of the resulting models on the specific characteristics of the training samples. In this work, we address these challenges by leveraging computer-generated synthetic data alongside novel modifications to the You Only Look Once (YOLO) architecture that enhance its robustness, its adaptability to new environments, and its accuracy in detecting small targets. Our method introduces a new patch-sample extraction technique and task-specific data augmentation, ensuring robust performance across diverse weather conditions. The results demonstrate the superiority of our proposal, showing an average 28% relative improvement in mean Average Precision (mAP) over the best-performing state-of-the-art baseline when sufficient real training data are available, and a remarkable 218% improvement when real data are limited. The proposal also strikes a favorable balance between efficiency, effectiveness, and resource requirements.