Improving visual question answering for bridge inspection by pre‐training with external data of image–text pairs

Thannarot Kunlamai, Masanori Suganuma, Takayuki Okatani, Tatsuro Yamane, Pang‐Jo Chun

doi:10.1111/mice.13086

Open Access

https://doi.org/10.1111/mice.13086

Abstract

This paper explores the application of visual question answering (VQA) in bridge inspection using recent advancements in multimodal artificial intelligence (AI) systems. VQA involves an AI model providing natural language answers to questions about the content of an input image. However, applying VQA to bridge inspection poses challenges due to the high cost of creating training data that requires expert knowledge. To address this, we propose leveraging existing bridge inspection reports, which already include image–text pairs, as external knowledge to enhance VQA performance. Our approach involves training the model on a large collection of image–text pairs, followed by fine‐tuning it on a limited amount of training data specifically designed for the VQA task. The results demonstrate a significant improvement in VQA accuracy using this approach. These findings highlight the potential of AI models for VQA as valuable tools for assessing the condition of bridges.
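
The two-stage schedule described in the abstract (pre-train on a large set of image–text pairs, then fine-tune on a small VQA set) can be illustrated with a toy sketch. This is not the authors' model or data: the linear classifier, the synthetic feature vectors standing in for image–text pairs, and every name here (`make_data`, `train`, `pretext_w`, `vqa_w`) are assumptions made only to show how fine-tuning starts from pre-trained weights instead of from scratch.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(weights, data):
    """Mean binary cross-entropy of a linear classifier over (features, label) pairs."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def train(weights, data, lr, epochs):
    """Full-batch gradient descent starting from `weights`; returns a new list."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for j, xj in enumerate(x):
                grad[j] += (p - y) * xj / len(data)
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w


def make_data(n, true_w, rng):
    """Synthetic stand-in data: random features labelled by a hidden linear rule."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in true_w]
        y = 1 if sum(w * xi for w, xi in zip(true_w, x)) > 0 else 0
        data.append((x, y))
    return data


rng = random.Random(0)
dim = 4
pretext_w = [1.0, -2.0, 0.5, 1.5]  # hypothetical rule behind the large external data
vqa_w = [1.2, -1.8, 0.4, 1.6]      # hypothetical, related rule behind the VQA task

pretext_data = make_data(500, pretext_w, rng)  # large: plays the inspection-report pairs
vqa_train = make_data(12, vqa_w, rng)          # small: plays the costly expert-labelled set
vqa_test = make_data(200, vqa_w, rng)

zero = [0.0] * dim
pretrained = train(zero, pretext_data, lr=0.5, epochs=100)   # stage 1: pre-training
finetuned = train(pretrained, vqa_train, lr=0.5, epochs=10)  # stage 2: fine-tuning
scratch = train(zero, vqa_train, lr=0.5, epochs=10)          # baseline: no pre-training

print("held-out loss, pre-train + fine-tune:", round(mean_loss(finetuned, vqa_test), 3))
print("held-out loss, fine-tune only:", round(mean_loss(scratch, vqa_test), 3))
```

The only structural difference between `finetuned` and `scratch` is the starting point passed to `train`, which is the essence of using external data as prior knowledge. The printed numbers are properties of this toy setup and say nothing about the accuracy gains reported in the paper.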

Journal: Computer-Aided Civil and Infrastructure Engineering
Publication Date: Aug 18, 2023
Citations: 6
License type: CC BY-NC-ND 4.0
