Abstract

Deep learning-based computer vision algorithms, especially image segmentation networks, have been successfully applied to pixel-level crack detection. Prediction accuracy depends heavily on how well a model detects fine-grained cracks and removes crack-like noise. We propose a fast encoder-decoder network with scaling attention. To improve the detection accuracy of tiny cracks, we focus on low-level feature maps by minimizing the number of encoder-decoder pairs and adopting an Atrous Spatial Pyramid Pooling (ASPP) layer. The second challenge is reducing crack-like noise, for which we introduce a novel scaling attention, AG+, that suppresses irrelevant regions. However, crack-like noise such as grooving is difficult to remove with an improved segmentation network alone. We therefore generate a crack dataset containing 11,226 sets of images and masks, which is effective for detecting detailed tiny cracks and removing non-semantic objects. Our model is evaluated on the generated dataset and compared with state-of-the-art segmentation networks. We use the mean Dice coefficient (mDice) and mean Intersection over Union (mIoU) to compare segmentation performance, and FLOPs to compare computational complexity. The experimental results show that our model improves the detection accuracy of fine-grained cracks and substantially reduces the computational cost. The mDice score of the proposed model is within 1.2% of the best score, while the model requires about half as many FLOPs.
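
For readers unfamiliar with the two accuracy metrics named above, the following is a minimal sketch of how Dice and IoU can be computed for binary crack masks. It is not the authors' evaluation code. The function names (`dice`, `iou`, `mean_score`), the smoothing term `eps`, and the choice to average per-image scores are all assumptions made for illustration; the abstract does not state how the means are taken.

```python
from typing import Callable, Sequence, Tuple

# A binary mask as nested sequences: 1 = crack pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def _counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Return (overlap, predicted crack pixels, ground-truth crack pixels)."""
    inter = pred_sum = target_sum = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            inter += p & t
            pred_sum += p
            target_sum += t
    return inter, pred_sum, target_sum


def dice(pred: Mask, target: Mask, eps: float = 1e-7) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|). eps is an assumed smoothing term."""
    inter, p, t = _counts(pred, target)
    return (2 * inter + eps) / (p + t + eps)


def iou(pred: Mask, target: Mask, eps: float = 1e-7) -> float:
    """Intersection over Union: |A ∩ B| / |A ∪ B|."""
    inter, p, t = _counts(pred, target)
    union = p + t - inter
    return (inter + eps) / (union + eps)


def mean_score(metric: Callable[[Mask, Mask], float],
               preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average a per-image metric over a dataset (assumed averaging scheme)."""
    scores = [metric(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Toy example: the prediction marks two pixels, one of which is a true crack pixel.
pred = [[1, 1, 0], [0, 0, 0]]
target = [[1, 0, 0], [0, 0, 0]]
example_dice = dice(pred, target)
example_iou = iou(pred, target)
```

In the toy example the overlap is one pixel, so Dice is about 2/3 and IoU is about 1/2. Dice weights the overlap twice and is therefore never lower than IoU for the same pair of masks.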
