Bridge photographs contain significant technical information, such as which structural members are damaged and the types of damage present, yet interpreting these details is not always straightforward. Despite advances in image analysis for bridge inspection, a significant gap remains in converting such images into comprehensible explanatory texts that less experienced engineers and administrative staff can readily use for effective maintenance decision-making. In this study, we developed a deep-learning-based model that generates explanatory texts from bridge images, together with a web system that can be used during bridge inspections. The proposed method provides user-friendly, text-based explanations of the bridge damage depicted in an image, allowing relatively inexperienced engineers and administrative staff without extensive technical expertise to understand that damage in textual form. In addition, the system continually improves its performance by retraining on the data accumulated as users interact with it. This paper describes the image captioning technique used to generate the explanatory texts and the architecture of the web system.