Optimizing GPT-4 Turbo Diagnostic Accuracy in Neuroradiology through Prompt Engineering and Confidence Thresholds.

Akihiko Wada, George Shih, Yayoi Hayakawa, Keigo Shimoji, Shigeki Aoki, Katsuhiro Sano, Junko Kikuta, Mitsuo Nishizawa, Akifumi Hagiwara, Atsushi Nakanishi, Toshiaki Akashi, Koji Kamagata

doi:10.3390/diagnostics14141541

Abstract

Integrating large language models (LLMs) such as GPT-4 Turbo into diagnostic imaging faces a significant challenge: current misdiagnosis rates range from 30% to 50%. This study evaluates how prompt engineering and confidence thresholds can improve diagnostic accuracy in neuroradiology. We analyzed 751 neuroradiology cases from the American Journal of Neuroradiology using GPT-4 Turbo with customized prompts to improve diagnostic precision. Initially, GPT-4 Turbo achieved a baseline diagnostic accuracy of 55.1%. After responses were reformatted to list five diagnostic candidates and a 90% confidence threshold was applied, the precision of the top-ranked diagnosis rose to 72.9%, and the candidate list contained the correct diagnosis in 85.9% of cases, reducing the misdiagnosis rate to 14.1%. However, this threshold reduced the number of cases for which the model gave a response. Strategic prompt engineering and high confidence thresholds significantly reduce misdiagnoses and improve the precision of LLM-based diagnosis in neuroradiology. More research is needed to optimize these approaches for broader clinical implementation, balancing accuracy and utility.
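The trade-off described in the abstract, where a higher confidence threshold improves precision but leaves fewer cases answered, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The `CaseResult` record, the `evaluate` function, the assumption that each response carries a single confidence score between 0 and 1, and the toy cases are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CaseResult:
    """Hypothetical record for one case (not the study's data format)."""
    candidates: List[str]  # ranked diagnoses, most likely first (up to five)
    confidence: float      # assumed model-reported confidence in [0, 1]
    truth: str             # reference diagnosis for the case


def evaluate(results: List[CaseResult], threshold: float) -> Dict[str, Optional[float]]:
    """Score only the cases whose confidence meets the threshold."""
    answered = [r for r in results if r.confidence >= threshold]
    if not answered:
        return {"coverage": 0.0, "top1": None, "top5": None, "miss": None}
    n = len(answered)
    # Top-1: the first-ranked candidate matches the reference diagnosis.
    top1 = sum(r.candidates[:1] == [r.truth] for r in answered) / n
    # Top-5: the reference diagnosis appears anywhere in the first five candidates.
    top5 = sum(r.truth in r.candidates[:5] for r in answered) / n
    return {
        "coverage": n / len(results),  # share of cases that received an answer
        "top1": top1,
        "top5": top5,
        "miss": 1.0 - top5,            # misdiagnosis rate among answered cases
    }


# Toy cases, invented for illustration only.
toy = [
    CaseResult(["glioblastoma", "metastasis"], 0.95, "glioblastoma"),
    CaseResult(["meningioma", "schwannoma"], 0.92, "schwannoma"),
    CaseResult(["abscess"], 0.60, "lymphoma"),
    CaseResult(["cavernoma"], 0.91, "aneurysm"),
]

if __name__ == "__main__":
    for t in (0.0, 0.9):
        print(t, evaluate(toy, t))
```

In this sketch the misdiagnosis rate is taken as the complement of the candidate-list hit rate, which matches the arithmetic in the abstract (100% - 85.9% = 14.1%). Coverage is reported alongside it because a stricter threshold leaves more cases unanswered.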

Journal: Diagnostics (Basel, Switzerland)
Publication Date: Jul 17, 2024
Citations: 1
License type: CC BY 4.0

