Citrus fruit detection in hilly and mountainous orchards suffers from missed and false detections caused by leaf occlusion, fruit overlap, and variations in natural light. To address these problems, this paper proposes a citrus detection model based on an improved YOLOv5 algorithm. Receptive field convolutions with full 3D weights (RFCF) are introduced to overcome the limitation of parameter sharing in standard convolution operations, which enhances detection accuracy. A focused linear attention (FLA) module is incorporated to improve the expressive power of the self-attention mechanism while maintaining computational efficiency. In addition, the anchor boxes are re-clustered according to the shape characteristics of the target objects, and the bounding box loss function is replaced with Focal-EIoU, which improves the model’s localization ability. In experiments on a citrus fruit dataset collected in hilly and mountainous areas and labeled with LabelImg, the model achieved a detection precision of 95.83% and a mean average precision (mAP) of 79.68%. These results indicate improved detection performance in complex environments and provide data support for precision tasks such as in-orchard fruit localization and intelligent picking, demonstrating the model's potential for practical application in smart agriculture.
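As an illustration of the bounding box loss named above, the following is a minimal sketch of Focal-EIoU for a single pair of boxes. It follows the commonly published definition (an EIoU loss, i.e. 1 - IoU plus normalized center, width, and height penalties, re-weighted by IoU raised to a power gamma) and is not taken from this paper's implementation. The function name `focal_eiou_loss`, the corner-format box layout, and the default `gamma=0.5` are assumptions made for the example.

```python
def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-9):
    """Focal-EIoU loss for one box pair (illustrative sketch).

    Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
    gamma and eps are assumed defaults, not values from the paper.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the
    # squared diagonal of the enclosing box.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalized by the enclosing box sides.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    eiou = 1.0 - iou + center_term + width_term + height_term

    # Focal re-weighting: high-IoU (high-quality) pairs get more weight.
    # With this form, non-overlapping boxes (IoU = 0) contribute zero loss.
    return (iou ** gamma) * eiou


if __name__ == "__main__":
    print(focal_eiou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

In a training pipeline this computation would normally be vectorized over batches of tensors; the scalar form is used here only to keep each term of the loss visible.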