Abstract

Target detection based on convolutional neural networks is widely deployed in industry. However, detection accuracy on X-ray images in security screening scenarios still requires improvement. This paper proposes an architecture that couples multi-scale feature extraction (MSE) with multi-scale attention (MSA). We integrate this architecture into the Single Shot MultiBox Detector (SSD) algorithm and find that it significantly improves target detection performance. First, ResNet replaces the original VGG network as the backbone to strengthen the feature extraction capability of the convolutional neural network. Second, the MSE structure is designed to enrich the information contained in the multi-stage prediction feature layers. Third, the MSA architecture is fused onto the prediction feature layers to suppress interference from redundant features and extract effective contextual information. In addition, a combination of Adaptive-NMS and Soft-NMS is used during non-maximum suppression to output the final predicted boxes. Experimental results show that the improved method raises the mean average precision (mAP) by 7.4% compared to the original approach. The new modules substantially improve detection accuracy while leaving detection speed unchanged.
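To make the non-maximum suppression step more concrete, the following is a minimal sketch of Gaussian Soft-NMS alone, written in plain Python. It is not the paper's implementation: how the authors combine Soft-NMS with Adaptive-NMS is not described here, so that combination is left out. The function names `iou` and `soft_nms`, the `(x1, y1, x2, y2)` box format, and the default values of `sigma` and `score_threshold` are assumptions chosen for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (illustrative defaults, not values from the paper).

    Instead of discarding boxes that overlap the current best box, each
    overlapping box has its score multiplied by exp(-IoU^2 / sigma).
    Returns a list of (box_index, rescored_confidence) in selection order.
    """
    remaining = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the highest-scoring box that is still under consideration.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        if best_score < score_threshold:
            break
        kept.append((best_idx, best_score))

        # Decay the scores of the other boxes according to their overlap.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_threshold:
                decayed.append((idx, score))
        remaining = decayed
    return kept


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    demo_scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(demo_boxes, demo_scores):
        print(index, round(score, 3))
```

In the demo, the second box overlaps the first heavily, so its score is reduced rather than zeroed out, and it ends up ranked below the distant third box. This soft decay is what lets heavily overlapping objects, which are common in packed luggage, survive suppression with a lower confidence.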

