Abstract

Automatic approaches to detecting pavement cracks are urgently needed for roadway maintenance. Building on the development of convolutional neural networks (CNNs), previous studies have focused on detecting pavement cracks through local feature extraction with consecutive convolutional operations. However, this approach loses detailed information, so CNNs fail to accurately inspect long and complicated cracks under the noisy conditions that are common on pavement surfaces, which reduces detection accuracy. To address this issue, this study proposes a Transformer-based semantic segmentation network that combines the Swin Transformer as the Encoder with UperNet and an attention module as the Decoder for robust and accurate pixel-level pavement crack detection. The hierarchical architecture of the Swin Transformer allows the network to learn global and long-range semantic features of the pavement crack, improving segmentation accuracy. With the attention module, the Decoder can recover more detail of the crack, yielding accurate detection results on fine and tiny pavement cracks. To validate the proposed network, we trained and tested six semantic segmentation models on three public pavement crack datasets. Compared with the other models, the proposed model achieves the best performance both in visualization and on the evaluation metrics of mean F1 (mF1) and mean Recall (mRecall) with 0-pixel tolerance. This work paves the way for future applications of automatic pavement crack detection using Transformer-based networks.
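To make the evaluation metrics concrete, the following is a minimal sketch of how mF1 and mRecall with 0-pixel tolerance might be computed for a single predicted mask. It assumes that "0-pixel tolerance" means a predicted pixel counts as correct only if it coincides exactly with the ground-truth pixel, and that the "mean" is taken over the two classes (background = 0, crack = 1); the abstract does not spell out either convention. The function names `class_scores` and `mean_scores` are hypothetical and are not taken from the paper.

```python
def class_scores(pred, truth, cls):
    """Precision, recall and F1 for one class with exact (0-pixel) matching.

    pred and truth are equally sized 2D lists of integer class labels.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == cls and t == cls:
                tp += 1          # predicted cls, and it really is cls
            elif p == cls:
                fp += 1          # predicted cls, but truth differs
            elif t == cls:
                fn += 1          # missed a true cls pixel
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def mean_scores(pred, truth, classes=(0, 1)):
    """Average F1 and recall over classes (assumed: background and crack)."""
    scores = [class_scores(pred, truth, c) for c in classes]
    m_f1 = sum(s[2] for s in scores) / len(scores)
    m_recall = sum(s[1] for s in scores) / len(scores)
    return m_f1, m_recall


# Toy 3x3 example: a vertical crack whose last pixel is predicted one
# column to the right. With 0-pixel tolerance that shift counts as an error.
truth = [[0, 1, 0],
         [0, 1, 0],
         [0, 1, 0]]
pred = [[0, 1, 0],
        [0, 1, 0],
        [0, 0, 1]]
m_f1, m_recall = mean_scores(pred, truth)
print(f"mF1={m_f1:.3f}, mRecall={m_recall:.3f}")
```

Under a looser tolerance (for example, 2 pixels) the shifted pixel in the toy example would be accepted as a match, which is why 0-pixel tolerance is the strictest setting for thin structures such as cracks.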
