Detecting dangerous goods in X-ray security images is challenging because of complex backgrounds, overlapping items, and targets at multiple scales. Traditional methods often struggle to identify objects accurately, especially in cluttered scenes. This paper presents an improved detection model, YOLOv8n-GEMA, which incorporates several enhancements to address these issues. First, the generalized efficient layer aggregation network (GELAN) module is employed to strengthen feature fusion. Second, to handle overlap and occlusion in X-ray images, the efficient multi-scale attention (EMA) module is used to capture the features of overlapping items and the interdependencies among them, improving the model’s detection capability in such scenes. Finally, to account for the diverse sizes of items in X-ray images, the Inner-CIoU loss function computes the loss using auxiliary bounding boxes at varying scale ratios, yielding faster and more effective bounding box predictions. The improved model was evaluated on the public datasets SIXRay, HiXray, CLCXray, and PIDray, where its mean average precision (mAP) reached 94.4%, 82.0%, 88.9%, and 85.9%, respectively, improvements of 3.6%, 1.6%, 0.9%, and 3.4% over the original YOLOv8. These results demonstrate the effectiveness and generality of the proposed method. Compared with current mainstream models for detecting dangerous goods in X-ray images, the proposed model significantly reduces the false detection rate and achieves substantial improvements on overlapping and multi-scale targets, resulting in higher detection accuracy.
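The idea of computing an overlap term on auxiliary bounding boxes at a chosen scale ratio can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`scaled_box`, `iou`, `inner_iou`), the `(x1, y1, x2, y2)` box format, and the choice to scale each box about its own center are assumptions made for illustration, and the additional CIoU penalty terms are omitted.

```python
def scaled_box(box, ratio):
    """Return an auxiliary box with the same center as `box`,
    with width and height multiplied by `ratio` (hypothetical helper)."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * ratio / 2.0
    half_h = (y2 - y1) * ratio / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def iou(a, b):
    """Plain intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def inner_iou(pred, target, ratio=1.0):
    """IoU computed on the auxiliary (scaled) boxes instead of the originals.
    ratio < 1 shrinks both boxes; ratio > 1 enlarges them (assumed convention)."""
    return iou(scaled_box(pred, ratio), scaled_box(target, ratio))


if __name__ == "__main__":
    pred = (0.0, 0.0, 10.0, 10.0)
    target = (5.0, 0.0, 15.0, 10.0)
    for r in (0.5, 1.0, 1.5):
        print(r, inner_iou(pred, target, r))
```

In this sketch, a ratio below 1 makes the overlap term stricter for boxes that already overlap, while a ratio above 1 still produces a non-zero overlap for boxes that are far apart, which is one plausible reading of why varying the ratio helps with targets of different sizes.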