Two-stage framework with improved U-Net based on self-supervised contrastive learning for pavement crack segmentation

Qingsong Song, Wei Yao, Haojiang Tian, Yidan Guo, Ravie Chandren Muniyandi, Yisheng An

doi:10.1016/j.eswa.2023.122406

Abstract

Since the emergence of deep learning, automated detection of cracks in pavement images has progressed significantly. The dominant approach is supervised deep learning, which relies on large-scale labeled ground truth. In practice, however, most raw crack images are unlabeled, and supervised network models cannot fully exploit them. Contrastive learning, a representative self-supervised method, learns feature representations from unlabeled data and can thereby improve the accuracy of downstream tasks. This paper proposes a two-stage framework with an improved U-Net, based on self-supervised contrastive learning, for pavement crack image segmentation. The framework uses the improved U-Net as its base architecture to highlight the salient features of fine cracks, which are the segmentation target. The U-Net is improved by integrating residual structures and an attention mechanism into the typical U-Net architecture. The framework comprises two learning stages: pre-training and fine-tuning. In the pre-training stage, latent feature representations are learned from unlabeled crack images. Both crack images and pavement background images are included in the training data, so that through unsupervised contrast the model learns a mapping that separates cracks from their background in a high-dimensional vector space. In the fine-tuning stage, the network loads the pre-trained parameters and is retrained on the labeled training data. Experimental results show that the proposed two-stage framework significantly improves crack segmentation accuracy without requiring additional training samples or additional labeling.
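The pre-training idea described in the abstract, separating crack embeddings from background embeddings without labels, can be illustrated with a minimal sketch. The abstract does not state which contrastive loss the authors use, so the InfoNCE-style loss below is an assumption chosen for illustration, and the function names, the temperature value, and the toy embedding vectors are all hypothetical. In the framework itself, the embeddings would come from the improved U-Net encoder rather than from hand-written arrays.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss for a single anchor (assumed loss,
    not confirmed by the abstract).

    anchor:    (d,)   embedding of a crack patch
    positive:  (d,)   embedding of another view of a crack patch
    negatives: (n, d) embeddings of pavement background patches
    """
    a = l2_normalize(anchor)
    p = l2_normalize(positive)
    n = l2_normalize(negatives)

    # Cosine similarities scaled by the temperature.
    pos_logit = np.dot(a, p) / temperature
    neg_logits = (n @ a) / temperature

    # The positive pair sits at index 0; the loss is its negative log-softmax.
    logits = np.concatenate(([pos_logit], neg_logits))
    logits = logits - logits.max()  # subtract the max for numerical stability
    log_prob_pos = logits[0] - np.log(np.exp(logits).sum())
    return float(-log_prob_pos)


# Toy embeddings (hypothetical values, for illustration only).
crack_a = np.array([1.0, 0.1, 0.0])
crack_b = np.array([0.9, 0.2, 0.1])
background = np.array([[0.0, 1.0, 0.2],
                       [0.1, 0.0, 1.0]])

loss = info_nce_loss(crack_a, crack_b, background)
print(f"contrastive loss: {loss:.4f}")
```

The loss is small when the two crack embeddings point in a similar direction and the background embeddings point elsewhere, which is the separation the pre-training stage aims for. In the fine-tuning stage, the encoder weights obtained this way would be loaded and the network retrained with a supervised segmentation loss on the labeled data.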

Full Text