Obtaining spatial and topological information about cracks in construction materials is important for evaluating service performance in infrastructure engineering. Manually extracting crack information from an image is a tedious process. Deep learning-based semantic segmentation has recently been employed to alleviate this problem. However, existing semantic segmentation methods for crack detection are fully supervised, i.e., they require manually annotated pixel-level labels for training, which are time-consuming to produce. To solve this problem, this paper proposes a patch-based weakly supervised semantic segmentation network for crack detection. The proposed method uses image-level annotation as supervision and exploits the local similarity of the crack topology in the image. Using patches cropped from the image as input reduces the image complexity significantly without losing the spatial location information of the crack. A discriminative localization technique extracts rough location information of the crack from a trained classification network; this information is then refined by a conditional random field to obtain a synthetic label. These synthetic labels replace the manually annotated pixel-level labels for training the segmentation network. Thereafter, a neighborhood fusion strategy merges the patch outputs into the final result. Two datasets are employed to train and evaluate the proposed method. The results indicate that the proposed method achieves a performance comparable to that of fully supervised methods (an MIoU of 0.782 and F-score of 0.741 for the proposed method, compared with an MIoU of 0.821 and F-score of 0.8 for the fully supervised method), while reducing the annotation workload by approximately 80%.
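The patch-based part of this pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the neighborhood fusion strategy, so the code below assumes the simplest plausible variant, in which overlapping patches are cropped on a regular grid and their per-pixel scores are averaged when merged. The function names `crop_patches` and `fuse_patches`, the patch size, and the stride are all hypothetical choices made for illustration.

```python
def crop_patches(image, size, stride):
    """Crop square patches from a 2D list-of-lists image.

    Returns (row, col, patch) triples, so each patch keeps the
    position it was cropped from.
    """
    height, width = len(image), len(image[0])
    patches = []
    for r in range(0, height - size + 1, stride):
        for c in range(0, width - size + 1, stride):
            patch = [row[c:c + size] for row in image[r:r + size]]
            patches.append((r, c, patch))
    return patches


def fuse_patches(patches, height, width):
    """Merge per-patch score maps into one map of the given size.

    Assumption: overlapping predictions are averaged. Pixels not
    covered by any patch are left at 0.0.
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for r, c, patch in patches:
        for i, row in enumerate(patch):
            for j, value in enumerate(row):
                total[r + i][c + j] += value
                count[r + i][c + j] += 1
    return [
        [total[i][j] / count[i][j] if count[i][j] else 0.0
         for j in range(width)]
        for i in range(height)
    ]


# Toy 4x4 binary "crack mask" standing in for a real image.
image = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]

patches = crop_patches(image, size=2, stride=1)

# In the real pipeline each patch would pass through the segmentation
# network; here the patches are fused unchanged to show the round trip.
fused = fuse_patches(patches, height=4, width=4)
```

Because every patch carries its crop coordinates, the merged map has the same spatial layout as the input, which is the property the abstract refers to when it says that cropping does not lose the location information of the crack. With a stride that does not evenly tile the image, border pixels would remain uncovered in this sketch; a practical implementation would pad the image or add edge-aligned patches.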