Abstract

Photographs of bridges can reveal considerable technical information, such as which part of the structure is damaged and the type of damage. Maintenance and inspection engineers can benefit greatly from a technology that can automatically extract such information and express it in readable sentences. This is possibly the first study to develop a deep learning model that generates sentences describing the damage condition of a bridge from images through an image captioning method. Our study shows that introducing an attention mechanism into the deep learning model enables highly accurate descriptive sentences to be generated. In addition, multiple forms of damage can often be observed in images of bridges; hence, our algorithm is adapted to output multiple sentences to provide a comprehensive interpretation of complex images. On our dataset, the Bilingual Evaluation Understudy (BLEU)-1 to BLEU-4 scores were 0.782, 0.749, 0.711, and 0.693, respectively, and the percentage of correctly output explanatory sentences was 69.3%. All of these results are better than those of the model without the attention mechanism. The developed method makes it possible to provide user-friendly, text-based explanations of bridge damage in images, making it easier for engineers with relatively little experience, and even administrative staff without extensive technical expertise, to understand images of bridge damage. Future research in this field is expected to lead to the unification of field expertise with artificial intelligence (AI), which will form the foundation for the evolutionary development of bridge inspection AI.
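To illustrate the kind of attention mechanism the abstract refers to, the following is a minimal sketch, not the authors' implementation, of additive (soft) visual attention as commonly used in image captioning: at each decoding step, the decoder's hidden state is scored against CNN feature-map regions, and the resulting weighted context vector conditions the next word. All dimensions and names here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    """Soft attention over image regions, in the style of 'Show, Attend and Tell'.

    Hypothetical dimensions: feat_dim for CNN region features,
    hidden_dim for the caption decoder's hidden state.
    """
    def __init__(self, feat_dim=2048, hidden_dim=512, attn_dim=512):
        super().__init__()
        self.feat_proj = nn.Linear(feat_dim, attn_dim)      # projects each image region
        self.hidden_proj = nn.Linear(hidden_dim, attn_dim)  # projects decoder state
        self.score = nn.Linear(attn_dim, 1)                 # scalar relevance per region

    def forward(self, features, hidden):
        # features: (batch, regions, feat_dim); hidden: (batch, hidden_dim)
        energy = torch.tanh(self.feat_proj(features) + self.hidden_proj(hidden).unsqueeze(1))
        alpha = torch.softmax(self.score(energy).squeeze(-1), dim=1)  # (batch, regions)
        context = (alpha.unsqueeze(-1) * features).sum(dim=1)         # (batch, feat_dim)
        return context, alpha  # alpha indicates which regions drove the generated word
```

The attention weights `alpha` also make the model's output interpretable: for each generated word, they indicate which part of the damage photograph the model attended to.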
