In recent years, deep learning-based crack detection methods have been widely explored and applied because of their versatility and adaptability. In civil engineering, recent research on crack detection using deep convolutional neural networks (DCNNs) includes road pavement crack detection, bridge inspection, and defect detection in shield tunnel lining. Despite the increasing popularity of DCNNs for crack detection, many challenges have yet to be properly addressed. For crack detection using three-dimensional (3D) range (i.e., elevation) images, disturbances such as surface variation can degrade detection performance. In addition, some typical non-crack patterns such as grooves are easily misidentified as cracks, producing false positives. Another issue lies in the selection of hyperparameters for the design of a DCNN architecture. For example, hyperparameters related to network structure (e.g., kernel size, network depth, and width) and to training (e.g., mini-batch size and learning rate) can significantly affect network performance, so they need to be properly determined for optimal performance. However, for deep learning-based roadway crack classification using laser-scanned range images, hyperparameter selection and tuning have not yet been comprehensively discussed. This study develops a hyperparameter selection process involving a series of experiments on highly diverse laser-scanned range images, investigating the optimal joint hyperparameter configuration of network structure and training for DCNN-based roadway crack classification. In a comparative study, 36 DCNN architectures with varying layouts are developed for crack classification. These candidate architectures differ in kernel size (e.g., 3 × 3, 7 × 7, and 11 × 11), network depth (from 5 to 8 weight layers), and width (from 16 to 96 kernels in each convolutional layer).
The 7-layer DCNN with constant 7 × 7 kernels and increasing network widths yields the highest classification performance among the 36 proposed DCNN classifiers, possibly because it best reflects the complexity of the acquired laser-scanned roadway range images. Once the optimal architecture layout is determined, the selection of mini-batch size, learning rate, dropout factor, and leaky rectified linear unit (LReLU) factor is discussed further. Experimental results show that the optimal architecture with its associated training configuration achieves consistent and accurate performance despite the contamination of laser-scanned range images by surface variations and grooved patterns. The discussion of hyperparameter selection can provide insights for the development of DCNNs in similar applications using laser-scanned range images.
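A comparative study of this kind can be organized as a grid over structural hyperparameters. The following is a minimal sketch of such an enumeration, not the authors' procedure. The kernel sizes, depth range, and width bounds come from the abstract; the three width schemes, the split of each network into convolutional layers plus two fully connected layers, the single-channel input, and all function names are assumptions made for illustration. This grid happens to produce 36 candidates, but the abstract does not state how its 36 architectures were actually composed.

```python
from itertools import product

KERNEL_SIZES = (3, 7, 11)        # square kernel sizes named in the abstract
DEPTHS = (5, 6, 7, 8)            # number of weight layers, per the abstract
# Hypothetical width schemes; the abstract gives only the 16-96 range
# and mentions "increasing" widths for the best-performing network.
WIDTH_SCHEMES = ("constant_narrow", "constant_wide", "increasing")


def layer_widths(scheme, n_conv, lo=16, hi=96):
    """Return the number of kernels in each convolutional layer."""
    if scheme == "constant_narrow":
        return [lo] * n_conv
    if scheme == "constant_wide":
        return [hi] * n_conv
    if scheme == "increasing":
        if n_conv == 1:
            return [lo]
        step = (hi - lo) / (n_conv - 1)   # linear ramp from lo to hi
        return [round(lo + i * step) for i in range(n_conv)]
    raise ValueError(f"unknown width scheme: {scheme}")


def enumerate_candidates(n_fc=2):
    """Build one configuration per (kernel size, depth, width scheme).

    n_fc is an assumed number of fully connected layers; the remaining
    weight layers are treated as convolutional.
    """
    candidates = []
    for kernel, depth, scheme in product(KERNEL_SIZES, DEPTHS, WIDTH_SCHEMES):
        n_conv = depth - n_fc
        candidates.append({
            "kernel": kernel,
            "depth": depth,
            "scheme": scheme,
            "widths": layer_widths(scheme, n_conv),
        })
    return candidates


def conv_param_count(kernel, widths, in_channels=1):
    """Count weights and biases in the convolutional layers only.

    in_channels=1 assumes a single-channel range (elevation) image.
    """
    total = 0
    c_in = in_channels
    for c_out in widths:
        total += kernel * kernel * c_in * c_out + c_out
        c_in = c_out
    return total


if __name__ == "__main__":
    grid = enumerate_candidates()
    print(len(grid), "candidate layouts")
    for cfg in grid:
        if (cfg["kernel"], cfg["depth"], cfg["scheme"]) == (7, 7, "increasing"):
            print(cfg, conv_param_count(cfg["kernel"], cfg["widths"]))
```

Counting convolutional parameters per candidate gives a rough sense of how model capacity grows with kernel size and width, which is one way to reason about why a mid-sized layout might fit a given dataset better than the smallest or largest option.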