Abstract

Automated pavement crack segmentation is challenging because slender cracks are hard to detect against complex pavement backgrounds and because lighting conditions strongly affect image appearance. In this paper, we propose a novel multi-scale feature fusion network for automated pavement crack detection, built on the Transformer architecture with an encoder-decoder structure. In the encoding phase, the Transformer replaces the convolution operation, using global modeling to strengthen feature extraction and capture long-range dependencies. Dilated convolution is then applied to enlarge the receptive field of the feature map while maintaining its resolution, further improving the acquisition of context information. In the decoding phase, a linear layer adjusts the length of the feature sequences output by the different encoder blocks, and multi-scale feature maps are obtained after dimension conversion. Fusing these multi-scale features restores the fine detail of cracks and thereby improves detection accuracy. The proposed method achieves an F1 score of 70.84% on the Crack500 dataset and 84.50% on the DeepCrack dataset, improvements of 1.42% and 2.07%, respectively, over the state-of-the-art method. The experimental results show that the proposed method offers higher detection accuracy and better generalization, and that it produces better crack detection results under both high- and low-brightness conditions.
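
The claim that dilated convolution enlarges the receptive field while keeping resolution can be illustrated with a minimal one-dimensional sketch in plain Python. This is not the paper's implementation: the network operates on two-dimensional feature maps, and the abstract does not state its kernel sizes or dilation rates. The function names and the values used here are assumptions chosen for illustration only.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span of input positions covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def same_padding(kernel_size, dilation):
    """Zero padding per side that keeps output length equal to input length
    (stride 1, odd kernel size)."""
    return dilation * (kernel_size - 1) // 2


def dilated_conv1d(signal, kernel, dilation=1):
    """Stride-1 dilated convolution (cross-correlation form) with zero padding."""
    k = len(kernel)
    pad = same_padding(k, dilation)
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    # Each output sample reads k inputs spaced `dilation` positions apart.
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


# Hypothetical toy input: a 3-tap kernel applied with dilation 1 and dilation 2.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
kernel = [1.0, 1.0, 1.0]
plain = dilated_conv1d(signal, kernel, dilation=1)
dilated = dilated_conv1d(signal, kernel, dilation=2)
```

Both outputs have the same length as the input, while the same three weights span three input positions at dilation 1 and five at dilation 2. The receptive field therefore grows without adding parameters or downsampling the feature map.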
