Abstract
A dual attention deep learning network is developed to automatically and accurately classify three types of steel defects, locate them, and delineate their shapes on the steel surface. The novel pixel‐level detection algorithm, called DAN‐DeepLabv3+, integrates a dual attention module into the DeepLabv3+ framework in pursuit of more precise segmentation results. On the one hand, the dual parallel attention module explicitly models rich contextual dependencies over local feature representations in the spatial and channel dimensions. On the other hand, DeepLabv3+, a popular encoder‐decoder architecture, captures multi‐scale contextual information and sharp object boundaries. DAN‐DeepLabv3+ is applied to an available dataset of 6666 images, in which three types of steel defects were captured by high‐frequency cameras and annotated manually. Experimental results show that, compared with other deep learning models, DAN‐DeepLabv3+ with the Xception backbone achieves the best segmentation performance, with a mean intersection over union (IoU) of 89.95% and a frequency‐weighted IoU of 97.34%. In addition, the F1‐scores for the three kinds of defects reach 86.90%, 99.20%, and 92.81%, respectively. The comparative study shows that both the dual attention module and DeepLabv3+ contribute to the improved segmentation performance. The significance of the proposed hybrid model lies in its more accurate detection of single or multiple steel defects, and it is shown to outperform other classical methods.
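For readers unfamiliar with the reported metrics, the following is a minimal sketch of how mean IoU, frequency‐weighted IoU, and per‐class F1‐score are commonly computed from a pixel‐level confusion matrix. It uses the standard textbook definitions and is not taken from the paper's evaluation code. The function names, the toy labels, and the choice to average over every class in the matrix are assumptions made for illustration.

```python
# Illustrative sketch only: standard definitions, not the paper's evaluation code.
# Function names and the toy data below are hypothetical.

def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return per-class IoU and F1, plus mean IoU and frequency-weighted IoU."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    ious, f1s, freqs = [], [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # class-c pixels that were missed
        fp = sum(cm[r][c] for r in range(n)) - tp   # pixels wrongly labelled as c
        union = tp + fp + fn
        ious.append(tp / union if union else 0.0)
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
        freqs.append(sum(cm[c]) / total)            # ground-truth share of class c
    return {
        "iou": ious,
        "f1": f1s,
        "mean_iou": sum(ious) / n,                  # unweighted average over classes
        "fw_iou": sum(f * i for f, i in zip(freqs, ious)),  # weighted by frequency
    }


# Toy example: flattened ground-truth and predicted masks with three classes.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
metrics = segmentation_metrics(confusion_matrix(y_true, y_pred, 3))
print(metrics)
```

Frequency‐weighted IoU weights each class by its share of ground‐truth pixels, so it can sit well above mean IoU when the most frequent class is segmented best, which is consistent with the gap between the 97.34% and 89.95% figures reported above.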