Drone-based fault detection for transmission lines provides real-time assessment of the operational status of transmission equipment and is therefore important for keeping transmission lines running stably. At present, inspection of transmission line equipment relies predominantly on manual work, which is affected by the natural surroundings and is consequently slow and prone to false detections. To address this, we propose an insulator defect recognition algorithm that combines noise reduction with target detection and is built on a YOLOv5 model with a new lightweight backbone network. First, we propose a new noise reduction algorithm, adaptive neighborhood-weighted median filtering (NW-AMF). It computes the median of each pixel's neighborhood through a weighted summation, filtering noise out of the captured aerial images and thereby reducing the adverse effect of varying noise levels on target detection. Second, we improve the RepVGG lightweight network structure to obtain a new lightweight structure, RcpVGG-YOLOv5. This structure uses multi-branch training, single-branch inference, and branch normalization, which improves quantization performance while balancing detection accuracy and speed. Third, we replace the original CIOU loss function with a new loss function, Focal EIOU. This loss adds a penalty on the edge lengths of the target frame and increases the gradient contribution of high-quality targets. The change addresses the imbalance between positive and negative samples for small targets, suppresses background positive samples, and ultimately improves detection accuracy.
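As a rough illustration of the loss described above, the sketch below follows the commonly published form of Focal EIOU: an EIOU term (IoU loss plus center-distance, width, and height penalties normalized by the smallest enclosing box) reweighted by IoU raised to a power gamma. The function name, the `(x1, y1, x2, y2)` box layout, and the default `gamma=0.5` are assumptions made for this example and are not taken from the paper's implementation.

```python
def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-9):
    """Focal EIOU for one box pair; boxes are (x1, y1, x2, y2), x1 < x2, y1 < y2.

    gamma and eps are illustrative defaults (assumptions), not values from the paper.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and target boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Center-distance penalty, normalized by the enclosing box diagonal.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Edge-length penalties: width and height differences, each normalized
    # by the corresponding side of the enclosing box.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    eiou = 1.0 - iou + center_term + width_term + height_term

    # Focal reweighting: boxes with higher IoU contribute more to the loss.
    return (iou ** gamma) * eiou
```

With `gamma=0.0` the function reduces to the plain EIOU loss, which makes the effect of the focal weighting easy to compare on a given box pair.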
Finally, to reflect real-world engineering conditions, the dataset used in this study consists of machine patrol images captured by the Unmanned Aerial Systems (UAS) of the Yunnan Power Supply Bureau Company. The experimental results show that the proposed algorithm improves on YOLOv5s, YOLOv7, and YOLOv8 in both accuracy and inference speed. Specifically, it achieves a 3.7% increase in accuracy and a 48.2% increase in inference speed over YOLOv5s, 2.7% and 33.5% over YOLOv7, and 1.5% and 13.1% over YOLOv8. Ablation experiments further validate the effectiveness of the proposed algorithm. The method is therefore practically applicable to detection in aerial images of transmission lines in complex environments. Future work should continue collecting aerial images for iterative training, further optimize the model, and investigate the challenges of detecting small targets in depth, all of which matter for the advancement of transmission line detection.
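The neighborhood-weighted median idea behind the noise reduction step can be sketched as follows. This is a generic weighted median filter, not the NW-AMF algorithm itself: the 3x3 weight mask `WEIGHTS`, the fixed window size, and the function names are hypothetical choices for illustration, and the adaptive part of the method is not reproduced here.

```python
# Hypothetical 3x3 weights that favor the center pixel (an assumption).
WEIGHTS = [[1, 2, 1],
           [2, 3, 2],
           [1, 2, 1]]


def weighted_median(values, weights):
    """Return the value at which the cumulative weight first reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    acc = 0.0
    for value, weight in pairs:
        acc += weight  # weighted summation over the sorted neighborhood
        if acc >= half:
            return value


def weighted_median_filter(img):
    """Apply a 3x3 weighted median to a grayscale image given as a list of rows."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            vals, wts = [], []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    # Neighbors outside the image are simply skipped.
                    if 0 <= yy < h and 0 <= xx < w:
                        vals.append(img[yy][xx])
                        wts.append(WEIGHTS[dy + 1][dx + 1])
            out[y][x] = weighted_median(vals, wts)
    return out


# Usage: a single impulse-noise pixel in a flat region is replaced by its surroundings.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
clean = weighted_median_filter(noisy)
```

A pure-Python loop keeps the sketch dependency-free; a practical version would use vectorized array operations.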