Utilizing a domain-specific large language model for LI-RADS v2018 categorization of free-text MRI reports: a feasibility study

Mario Matute-González, Anna Darnell, Marc Comas-Cufí, Javier Pazó, Alexandre Soler, Belén Saborido, Ezequiel Mauro, Juan Turnes, Alejandro Forner, María Reig, Jordi Rimola

doi:10.1186/s13244-024-01850-1

Abstract

Objective: To develop a domain-specific large language model (LLM) for LI-RADS v2018 categorization of hepatic observations based on free-text descriptions extracted from MRI reports.

Material and methods: This retrospective study included 291 small liver observations, divided into training (n = 141), validation (n = 30), and test (n = 120) datasets. Of these, 120 were fictitious, and 171 were extracted from 175 MRI reports from a single institution. The algorithm's performance was compared with that of two independent radiologists and one hepatologist in a human replacement scenario and in two combined strategies (double reading with arbitration, and triage). Agreement on LI-RADS category and on dichotomized malignancy (LR-4, LR-5, and LR-M) was estimated using linear-weighted κ and Cohen's κ, respectively. Sensitivity and specificity for LR-5 were calculated. The consensus agreement of three other radiologists served as the ground truth.

Results: The model showed moderate agreement against the ground truth for both LI-RADS categorization (κ = 0.54 [95% CI: 0.42–0.65]) and the dichotomized approach (κ = 0.58 [95% CI: 0.42–0.73]). Sensitivity and specificity for LR-5 were 0.76 (95% CI: 0.69–0.86) and 0.96 (95% CI: 0.91–1.00), respectively. When the chatbot was used as a triage tool, performance improved for LI-RADS categorization (κ = 0.86/0.87 for the two independent radiologists and κ = 0.76 for the hepatologist), dichotomized malignancy (κ = 0.94/0.91 and κ = 0.87), and LR-5 identification (1.00/0.98 and 0.85 sensitivity, 0.96/0.92 and 0.92 specificity), with no statistically significant difference from the human readers' individual performance. Through this strategy, the workload decreased by 45%.

Conclusion: LI-RADS v2018 categorization from unlabelled MRI reports is feasible using our LLM, and it enhances the efficiency of data curation.

Critical relevance statement: Our proof-of-concept study provides novel insights into the potential applications of LLMs, offering a real-world example of how these tools could be integrated into a local workflow to optimize data curation for research purposes.

Key Points:
Automatic LI-RADS categorization from free-text reports would be beneficial to workflow and data mining.
LiverAI, a GPT-4-based model, supported various strategies improving data curation efficiency by up to 60%.
LLMs can integrate into workflows, significantly reducing radiologists' workload.
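The agreement and accuracy measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the labels below are invented toy data, the helper names (`kappa`, `sensitivity_specificity`, `dichotomize`) are hypothetical, and placing LR-M last in the ordinal scale is an assumption made only so that linear weights can be computed.

```python
from collections import Counter

# Ordinal order used for the linear weights. Putting LR-M last is an
# assumption for illustration; the abstract does not state the ordering.
CATEGORIES = ["LR-1", "LR-2", "LR-3", "LR-4", "LR-5", "LR-M"]
MALIGNANT = {"LR-4", "LR-5", "LR-M"}  # dichotomization given in the abstract


def kappa(truth, pred, categories, linear=False):
    """Cohen's kappa; with linear=True, disagreement is weighted by |i - j|."""
    k, n = len(categories), len(truth)
    idx = {c: i for i, c in enumerate(categories)}

    def weight(i, j):
        if linear:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    pairs = Counter((idx[t], idx[p]) for t, p in zip(truth, pred))
    rows = Counter(idx[t] for t in truth)
    cols = Counter(idx[p] for p in pred)
    # Weighted disagreement actually observed vs. expected by chance.
    observed = sum(weight(i, j) * c / n for (i, j), c in pairs.items())
    expected = sum(
        weight(i, j) * rows[i] * cols[j] / (n * n)
        for i in range(k)
        for j in range(k)
    )
    if expected == 0:  # both raters used one identical category throughout
        return 1.0
    return 1.0 - observed / expected


def sensitivity_specificity(truth, pred, positive="LR-5"):
    """Sensitivity and specificity for one category treated as positive."""
    tp = sum(t == positive and p == positive for t, p in zip(truth, pred))
    fn = sum(t == positive and p != positive for t, p in zip(truth, pred))
    tn = sum(t != positive and p != positive for t, p in zip(truth, pred))
    fp = sum(t != positive and p == positive for t, p in zip(truth, pred))
    return tp / (tp + fn), tn / (tn + fp)


def dichotomize(labels):
    """Map LI-RADS categories to malignant / non-malignant."""
    return ["malignant" if x in MALIGNANT else "other" for x in labels]


# Toy labels, NOT study data: reference standard vs. a hypothetical reader.
truth = ["LR-3", "LR-4", "LR-5", "LR-5", "LR-M", "LR-3", "LR-5", "LR-4"]
pred = ["LR-3", "LR-5", "LR-5", "LR-4", "LR-M", "LR-3", "LR-5", "LR-4"]

weighted = kappa(truth, pred, CATEGORIES, linear=True)
binary = kappa(dichotomize(truth), dichotomize(pred), ["other", "malignant"])
sens, spec = sensitivity_specificity(truth, pred, positive="LR-5")
print(weighted, binary, sens, spec)
```

With linear weights, an LR-4 versus LR-5 disagreement is penalized less than an LR-3 versus LR-5 one, which is why the weighted statistic suits the ordinal categories while the unweighted Cohen's κ suits the two-class malignancy split. Confidence intervals such as those reported in the abstract would need an additional step (for example, bootstrapping), which is omitted here.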

Journal: Insights into Imaging
Publication Date: Nov 22, 2024
License type: CC BY
