Detecting cracks in optical images plays a crucial role in road maintenance, but doing so reliably presents many challenges. Road cracks exhibit significant diversity and complexity in shape, size and texture, and road images may contain various types of noise and interference, such as lighting variations, shadows and appearance changes caused by varying perspectives and scales. To address these challenges, a comprehensive road crack dataset called CRCrack has been constructed, encompassing a wide range of crack characteristics. This study proposes a road crack segmentation network called CSegNet, which combines convolutional neural networks (CNNs) and transformers within a DeepLabV3+ encoder-decoder framework. In the encoder, a ResNeXt-Transformer (ResNeXTR) feature extraction module is designed as the backbone network, leveraging the transformers' flexibility in modelling long-range dependencies and the CNNs' ability to capture local contextual information through local receptive fields, weight sharing and spatial subsampling, thereby enhancing feature extraction for road crack images. To reduce the computational cost of the transformer's self-attention (SA), an average pooling layer downsamples the encoded features before attention is computed. In the decoder, an efficient channel attention module (ECAM) and a spatial attention module (SAM) are combined into an efficient convolutional block attention module (ECBAM), which optimises feature representation and focuses on the key information of road cracks under diverse environmental conditions and interferences. Comparative experiments on the CRCrack dataset demonstrate that the proposed method outperforms classic networks such as U-Net and DeepLabV3+ in terms of intersection over union (IoU), Dice coefficient and area under the receiver operating characteristic curve (AUROC). It also exhibits good adaptability to ground crack images from different sources, providing a basis for estimating the degree of road damage.
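The abstract does not give implementation details for the ECBAM, so the following PyTorch sketch illustrates one plausible reading of it: ECA-style channel attention (a lightweight 1D convolution over the globally pooled channel descriptor) followed by CBAM-style spatial attention (a convolution over channel-wise average- and max-pooled maps). Class names, kernel sizes and the channel-then-spatial ordering are assumptions for illustration, not the authors' code.

```python
# Hedged sketch of an ECBAM-style block (assumed structure, not the paper's code):
# efficient channel attention (ECA) followed by CBAM-style spatial attention.
import torch
import torch.nn as nn


class EfficientChannelAttention(nn.Module):
    """ECA-style channel attention: global average pooling, then a 1D conv
    across channels (no dimensionality reduction), then sigmoid gating."""
    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                                   # x: (B, C, H, W)
        y = self.pool(x)                                    # (B, C, 1, 1)
        y = y.squeeze(-1).transpose(1, 2)                   # (B, 1, C)
        y = self.sigmoid(self.conv(y))                      # (B, 1, C)
        y = y.transpose(1, 2).unsqueeze(-1)                 # (B, C, 1, 1)
        return x * y                                        # reweight channels


class SpatialAttention(nn.Module):
    """CBAM-style spatial attention: a conv over channel-pooled feature maps."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                                   # x: (B, C, H, W)
        avg_map = x.mean(dim=1, keepdim=True)               # (B, 1, H, W)
        max_map = x.amax(dim=1, keepdim=True)               # (B, 1, H, W)
        attn = self.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn                                     # reweight locations


class ECBAM(nn.Module):
    """Channel attention first, then spatial attention (CBAM-style ordering)."""
    def __init__(self, eca_kernel: int = 3, spatial_kernel: int = 7):
        super().__init__()
        self.channel = EfficientChannelAttention(eca_kernel)
        self.spatial = SpatialAttention(spatial_kernel)

    def forward(self, x):
        return self.spatial(self.channel(x))


if __name__ == "__main__":
    feats = torch.randn(2, 256, 64, 64)                     # dummy decoder features
    print(ECBAM()(feats).shape)                             # torch.Size([2, 256, 64, 64])
```

The same gating pattern would be applied to decoder feature maps before upsampling; the abstract's point is that combining a cheap channel-attention branch with a spatial branch lets the decoder emphasise crack pixels under varied lighting and background interference at little extra cost.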