Foreground-Background Classification for Crack Detection

Fereshteh Nayyeri

doi:10.25904/1912/4008

Abstract

For health and safety monitoring of civil structures such as bridges, roads and pavements, segmenting the regions of interest is a fundamental requirement for image analysis at a high semantic level. Cracks are one of the major structural problems in concrete and asphalt surfaces: they first harm the visual appearance of a structure and can eventually lead to its failure. This study makes four contributions to separating the image foreground from the background in order to address crack detection on asphalt and concrete surfaces. Specifically, we model cracks as foreground objects and concrete or asphalt surfaces as textured background. The first contribution of this research is a hybrid image processing method for crack detection. In this method, cracks are modelled as linear structures on a background of textured concrete or asphalt, which can be extracted by combining structure extraction with global pattern distribution. The model builds the final structure-texture map in two phases. The first phase extracts strong structures or edges using relative total variation measures, producing a structure feature map that preserves edges and suppresses background noise. The second phase calculates the spatial distribution of textures across the image. A bag-of-words model is used in this phase to quantise the texture pattern, which in the crack detection application is the widely distributed road texture background. The local structure map and the global distribution map capture the crack structure and the textured background, respectively. The final crack map is obtained by fusing these two maps and applying binarisation as a post-processing step. This model achieved better results than the local structure extraction and the saliency method.
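The fusion and binarisation step of the hybrid method can be illustrated with a minimal sketch. The abstract does not state the fusion rule or the threshold, so both are assumptions here: the two maps are rescaled to [0, 1], the global distribution map is inverted (on the assumption that high values mark widely distributed background texture), the maps are fused by an elementwise product, and a fixed threshold produces the binary mask. The function names `normalise` and `fuse_and_binarise` are hypothetical, and the maps are plain nested lists to keep the example dependency-free.

```python
def normalise(grid):
    """Rescale a 2-D list of numbers to [0, 1]; a constant grid becomes all zeros."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in grid]
    return [[(v - lo) / (hi - lo) for v in row] for row in grid]


def fuse_and_binarise(structure_map, distribution_map, threshold=0.5):
    """Fuse a local structure map with a global distribution map.

    Assumptions (not taken from the thesis): high values in
    structure_map mark strong edges, high values in distribution_map
    mark widely distributed background texture, fusion is an
    elementwise product, and the threshold is fixed.
    """
    s = normalise(structure_map)
    d = normalise(distribution_map)
    # Keep a pixel only if it is a strong structure AND not background texture.
    fused = [
        [sv * (1.0 - dv) for sv, dv in zip(s_row, d_row)]
        for s_row, d_row in zip(s, d)
    ]
    fused = normalise(fused)
    return [[v >= threshold for v in row] for row in fused]


# Toy example: a vertical crack in column 2 plus one spurious edge response.
structure = [[0.0] * 5 for _ in range(5)]
distribution = [[0.9] * 5 for _ in range(5)]
for r in range(5):
    structure[r][2] = 1.0      # crack pixels respond strongly
    distribution[r][2] = 0.1   # crack pixels are not background texture
structure[0][0] = 0.6          # edge noise sitting on background texture

crack_mask = fuse_and_binarise(structure, distribution)
```

In this toy case the spurious edge at the top-left corner is suppressed because the distribution map marks it as background, which is the intuition behind combining a local and a global cue.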
As the second contribution, a large-scale dataset of asphalt and concrete crack images is prepared, including images with their corresponding high-resolution pixel-wise labels. To the best of our knowledge, no crack image dataset with pixel-wise ground truth labels was available before the completion of this thesis. The original crack image set includes 2532 images of cracks on brick and asphalt surfaces. This image set is split into training, validation and testing sets with a ratio of 50/25/25. Two augmentation techniques, rotation and flipping, are applied only to the training set, while the validation and testing sets are kept fixed to prevent data leakage. All models are trained on the training set after fine-tuning the hyper-parameters. After each round of tuning, an early estimate of the model accuracy is obtained using the validation set. Finally, an unbiased estimate of the fitted model's performance is obtained on the testing set. The third contribution of this thesis is the development of two encoder-decoder networks, which draw on recent advances in deep learning research for crack detection and are applied to our crack dataset. The first network is inspired by DeepLab, which is a modified ResNet architecture. In this network the last pooling layer is replaced with an Atrous Spatial Pyramid Pooling (ASPP) module. This encoder-decoder structure is designed to classify each image pixel into one of two classes: foreground crack or textured background. The second model is inspired by the Full Resolution Residual Network (FRRN), which is a ResNet-like network with two streams, a residual stream and a pooling stream, to extract the high- and low-level features, respectively. The combination of features from different levels in this model improves the localization of crack pixels as well as the recognition of the crack structure as a whole. As FRRN outperforms DeepLab on crack classification, we select it as the baseline for further research.
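The data protocol described above (a 50/25/25 split, with augmentation applied to the training set only) can be sketched as follows. This is an illustration under stated assumptions, not the thesis code: the random seed, the use of 90-degree rotations and horizontal flips, and the helper names `split_indices`, `rotate90`, `flip_horizontal` and `augment` are all hypothetical. The key points it shows are that the split happens before any augmentation, and that each image and its pixel-wise label mask are transformed identically.

```python
import random


def split_indices(n_images, seed=0):
    """Shuffle image indices and split them 50/25/25 into train/val/test."""
    order = list(range(n_images))
    random.Random(seed).shuffle(order)   # seed is an arbitrary assumption
    n_train = n_images // 2
    n_val = n_images // 4
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]
    return train, val, test


def rotate90(grid):
    """Rotate a 2-D list 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def flip_horizontal(grid):
    """Mirror a 2-D list left to right."""
    return [row[::-1] for row in grid]


def augment(image, mask):
    """Return rotated and flipped copies of an image and its label mask.

    The same transform is applied to both so the pixel-wise labels stay
    aligned. The first pair is the untransformed original.
    """
    pairs = []
    for _ in range(4):
        pairs.append((image, mask))
        pairs.append((flip_horizontal(image), flip_horizontal(mask)))
        image, mask = rotate90(image), rotate90(mask)
    return pairs


# Split first, then augment only the training images.
train_ids, val_ids, test_ids = split_indices(2532)
```

Splitting before augmenting matters: if augmented copies were generated first, a rotated copy of a test image could end up in the training set, which is the leakage the fixed validation and testing sets are meant to prevent.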
In our last contribution, we optimise the FRRN model by reducing the number of parameters. Inspired by the Inception module, which significantly improved the utilization of computing resources inside GoogLeNet, we propose the Incepted FRRN (I-FRRN) network by embedding the Inception module inside the FRRN. Combining these two structures, our proposed model records 88.14% accuracy in classifying the positive class, an improvement of 0.22%, while having less than half the number of parameters of FRRN. The results show that the proposed architecture achieves significant gains in computational efficiency and comparable or higher class accuracy on the crack classification task relative to the baseline model.
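To see why an Inception-style block can reduce the parameter count, the following worked arithmetic compares a single 3x3 convolution with a four-branch block that uses 1x1 bottlenecks. The channel widths are hypothetical values chosen for illustration and are not the I-FRRN configuration, which the abstract does not specify; `conv_params` is likewise an illustrative helper.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution from c_in to c_out channels."""
    return k * k * c_in * c_out + c_out


# Baseline: one plain 3x3 convolution, 256 -> 256 channels (hypothetical widths).
plain = conv_params(256, 256, 3)

# Inception-style block with the same 256 input and 256 output channels.
branch_1x1 = conv_params(256, 64, 1)
branch_3x3 = conv_params(256, 64, 1) + conv_params(64, 96, 3)   # 1x1 reduce, then 3x3
branch_5x5 = conv_params(256, 16, 1) + conv_params(16, 32, 5)   # 1x1 reduce, then 5x5
branch_pool = conv_params(256, 64, 1)                           # pooling, then 1x1
inception = branch_1x1 + branch_3x3 + branch_5x5 + branch_pool

# Concatenated branch outputs: 64 + 96 + 32 + 64 = 256 channels.
out_channels = 64 + 96 + 32 + 64

print(plain, inception, round(inception / plain, 3))
```

The saving comes from the 1x1 convolutions, which shrink the channel count before the more expensive 3x3 and 5x5 filters are applied. The ratio obtained here depends entirely on the assumed widths and should not be read as the reduction reported for I-FRRN.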
