Visual quality inspection is a crucial step in the production of bottled products. Many machine vision methods can reliably identify significant surface defects on bottles under well-controlled imaging conditions. In actual production, however, bottled products exhibit a wide variety of surface defect types with diverse shapes, and most defect instances occupy relatively small areas. Faced with such diverse, small-sized defects, traditional convolutional neural networks (CNNs) are limited by their fixed-size convolutional kernels, which constrain the receptive field and hinder the capture of sufficient contextual information. In addition, pooling operations in traditional CNNs sharply reduce feature map dimensions, so small defects become excessively blurred or are missed entirely, degrading the accuracy of surface defect detection and recognition. This study proposes a novel multi-scale defect detection model that incorporates a variable receptive field and a Gather–Distribute feature fusion mechanism to overcome these limitations. Defect images at nine different scales, covering categories such as worn areas, wrinkles, and joint markings, were collected and augmented to construct a surface defect detection dataset for the actual production of bottled products. Enhancements to the convolutional layers and the C2f module improve feature extraction for defects of different shapes and sizes. Integration of the Gather–Distribute feature fusion module (GDFFM) reduces feature information loss and improves the utilization of shallow-layer features. An efficient detection head (E-Detect) based on parameter sharing is proposed to reduce computational complexity and improve detection speed without significant loss of accuracy.
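The abstract does not specify how the variable receptive field is implemented; one common mechanism consistent with the idea is dilated convolution, where spacing out the kernel taps enlarges the receptive field without adding parameters. The following pure-Python sketch is an illustration of that general technique, not the paper's actual layer:

```python
def dilated_conv2d(image, kernel, dilation=1):
    """Valid 2D convolution with a dilated k x k kernel.

    A k x k kernel with dilation d covers an effective receptive field
    of (k - 1) * d + 1 pixels per side, so increasing d widens the
    context each output sees while reusing the same k * k weights.
    """
    k = len(kernel)
    span = (k - 1) * dilation + 1          # effective receptive field
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - span + 1):
        row = []
        for j in range(w - span + 1):
            acc = 0.0
            for a in range(k):             # sample taps `dilation` apart
                for b in range(k):
                    acc += image[i + a * dilation][j + b * dilation] * kernel[a][b]
            row.append(acc)
        out.append(row)
    return out

image = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]   # 3x3 mean filter

# dilation=1: 3x3 receptive field -> 4x4 output
# dilation=2: 5x5 receptive field -> 2x2 output, same 9 weights
print(len(dilated_conv2d(image, kernel, dilation=1)))  # 4
print(len(dilated_conv2d(image, kernel, dilation=2)))  # 2
```

In a CNN backbone, branches with different dilation rates can be combined so that large context (for elongated defects like wrinkles) and fine detail (for small worn spots) are captured at the same layer.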
Experimental results demonstrate the model’s superiority over advanced defect detection algorithms across various categories. Notably, it achieves accuracies of 89.9%, 55.6%, 89.4%, and 75% on challenging defects such as small worn areas, subtle wrinkles, inclined external covers, and misaligned labels, improvements of 3.2%, 7%, 2%, and 11.8% over the baseline model, respectively. The model’s mean Average Precision (mAP) also increases by 2.7%.