Accurate and rapid identification of weed species supports selective herbicide spraying and robotic weeding, so an image-based method for automatically identifying weed species in paddy fields is highly desirable. However, water reflection, soil background, occlusion, and variation in growth stage and illumination make such a method challenging to develop. To address these challenges, an improved deep learning model, GTCBS-YOLOv5s, was proposed to identify six weed species in paddy fields. Ghost, C3Trans, and the convolutional block attention module (CBAM) were employed to improve weed feature extraction in complex environments. A bidirectional feature pyramid network (BiFPN) coupled with a Concat structure was introduced in the Neck network to achieve multi-scale feature fusion for identifying various weed species. Three different output feature maps in the Detect network were used to identify weeds of varying sizes. A more comprehensive scale-sensitive intersection over union (SIoU) loss function was adopted to eliminate redundant generated boxes. The GTCBS-YOLOv5s model achieved a mean average precision (mAP) of 91.1 % on the test set at an identification speed of 85.7 FPS. Robustness tests demonstrated that GTCBS-YOLOv5s performed satisfactorily in identifying weeds under various lighting conditions, with precision (P), recall (R), and mAP all greater than 85 %. Occluded weeds were identified with P, R, and average precision (AP) greater than 89.8 %, 90.1 %, and 90.3 %, respectively. Furthermore, GTCBS-YOLOv5s performed well in identifying weeds at different growth stages, with P, R, and mAP higher than 90.1 %, 89.5 %, and 90.3 %, respectively. Compared with state-of-the-art models, GTCBS-YOLOv5s offered high accuracy, a lightweight structure, robustness, and fast inference, making it highly promising for deployment on embedded devices for real-time field detection.
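For readers less familiar with the metrics quoted above, the following is a minimal illustrative sketch, not the authors' implementation. It shows the plain intersection over union (IoU) term on which IoU-based losses such as SIoU build (the additional SIoU penalty terms are not reproduced here), mAP taken as the unweighted mean of per-class AP values, and the average per-image time implied by a throughput in FPS. The function names and the (x1, y1, x2, y2) box layout are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed box layout; this is only the base IoU term, not the SIoU loss.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; width and height clamp to zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mean_average_precision(ap_per_class):
    """mAP as the unweighted mean of per-class AP values (hypothetical helper)."""
    return sum(ap_per_class) / len(ap_per_class)


def latency_ms(fps):
    """Average time per image in milliseconds implied by a throughput in FPS."""
    return 1000.0 / fps
```

Under these definitions, the reported 85.7 FPS corresponds to an average of roughly 11.7 ms per image (`latency_ms(85.7)`).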