Performance comparison of retrieval-augmented generation and fine-tuned large language models for construction safety management knowledge retrieval

Jungwon Lee, Seungjun Ahn, Daeho Kim, Dongkyun Kim

doi:10.1016/j.autcon.2024.105846

https://doi.org/10.1016/j.autcon.2024.105846

Journal: Automation in Construction

Publication Date: Nov 12, 2024

Abstract

Construction safety standards exist in unstructured formats such as text and images, which complicates their effective use in daily tasks. This paper compares the performance of Retrieval-Augmented Generation (RAG) and a fine-tuned Large Language Model (LLM) for construction safety knowledge retrieval. The RAG model was created by integrating GPT-4 with a knowledge graph derived from construction safety guidelines, while the fine-tuned LLM was trained on a question-answering dataset derived from the same guidelines. The models were tested through case studies in which accident synopses served as queries for generating preventive measures. The responses were assessed using cosine similarity, Euclidean distance, BLEU, and ROUGE scores. Both models outperformed GPT-4, with the RAG model improving by 21.5% and the fine-tuned LLM by 26%. The findings highlight the relative strengths and weaknesses of the RAG and fine-tuned LLM approaches in terms of applicability and reliability for safety management.
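
To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how cosine similarity, Euclidean distance, and a ROUGE-1-style recall could be computed between a generated answer and a reference answer. It is not the authors' pipeline: the bag-of-words vectors stand in for whatever text representation the study used, and the function names, the whitespace tokenizer, and the example sentences are all hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bag_of_words(text, vocab):
    """Assumption: a simple word-count vector, used here in place of a learned embedding."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]


def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams that also appear in the candidate (clipped counts)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(cand[word], ref[word]) for word in ref)
    return overlap / total


# Hypothetical reference measure and model response, for illustration only.
reference = "install guardrails on open edges"
candidate = "install guardrails on all open edges near scaffolds"

vocab = sorted(set(reference.lower().split()) | set(candidate.lower().split()))
ref_vec = bag_of_words(reference, vocab)
cand_vec = bag_of_words(candidate, vocab)

similarity = cosine_similarity(ref_vec, cand_vec)
distance = euclidean_distance(ref_vec, cand_vec)
recall = rouge1_recall(candidate, reference)
```

Higher cosine similarity and ROUGE recall, and lower Euclidean distance, indicate a response closer to the reference. BLEU is omitted here because it combines clipped n-gram precision with a brevity penalty and is usually taken from an existing library rather than reimplemented.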

Full Text