Abstract

Deep learning can make pavement crack detection more automatic, efficient, and accurate than manual inspection. To address the limited receptive field of purely CNN-based crack detection networks, we propose SwinCrack, an end-to-end detection network based on the Swin-Transformer. By modeling long-range interactions and performing adaptive spatial aggregation, SwinCrack produces more accurate and continuous descriptions of pavement cracks than CNN-based detection models. To delineate crisp, accurate crack boundaries, we introduce convolution operations into the Swin-Transformer to recover more local and detailed crack information. Three modules, the Convolutional Patch Embedding Layer (CPEL), the Convolutional Swin-Transformer Block (CSTB), and the Depth-convolution Forward Network (DFN), are proposed and embedded into SwinCrack to capture more spatial context. In addition, a Convolutional Attention Gated Skip Connection (CAGSC) is designed to suppress background interference in low-level features. We perform five evaluation experiments on SwinCrack and an ablation study on the four proposed modules, and we visualize SwinCrack's attention maps to give better insight into the contribution of each embedded convolutional module. Evaluation results show that SwinCrack achieves OIS values of 0.781 to 0.849 and a maximum OIS improvement of 4.4% among the six public crack datasets.
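The OIS figures quoted above are easier to interpret with the metric written out. OIS is commonly expanded as "optimal image scale": for each image, the F-measure is computed at several binarization thresholds, the best value per image is kept, and those per-image maxima are averaged over the dataset. The sketch below is a minimal illustration under assumptions, not the evaluation code used for SwinCrack. It matches pixels exactly, whereas published crack benchmarks often allow a small pixel tolerance, and the names `f_measure` and `ois` and the toy data are hypothetical.

```python
from typing import Sequence


def f_measure(pred: Sequence[float], truth: Sequence[int], threshold: float) -> float:
    """F-measure of one flattened probability map binarized at `threshold`."""
    tp = fp = fn = 0
    for p, t in zip(pred, truth):
        detected = p >= threshold
        if detected and t:
            tp += 1          # crack pixel correctly detected
        elif detected and not t:
            fp += 1          # background pixel flagged as crack
        elif not detected and t:
            fn += 1          # crack pixel missed
    if tp == 0:
        return 0.0           # avoids division by zero; no true positives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ois(preds: Sequence[Sequence[float]],
        truths: Sequence[Sequence[int]],
        thresholds: Sequence[float]) -> float:
    """Mean over images of the best per-image F-measure across thresholds."""
    best_per_image = [
        max(f_measure(p, t, th) for th in thresholds)
        for p, t in zip(preds, truths)
    ]
    return sum(best_per_image) / len(best_per_image)


# Toy data (assumed): two 4-pixel "images" with predicted crack probabilities.
toy_preds = [[0.9, 0.8, 0.2, 0.1], [0.6, 0.4, 0.7, 0.1]]
toy_truths = [[1, 1, 0, 0], [1, 0, 0, 0]]
toy_thresholds = [0.3, 0.5, 0.65]
toy_ois = ois(toy_preds, toy_truths, toy_thresholds)
```

Because the threshold is chosen per image, OIS is never lower than the score obtained by fixing a single threshold for the whole dataset.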
