Can Artificial Intelligence Mitigate Missed Diagnoses by Generating Differential Diagnoses for Neurosurgeons?

Rohit Prem Kumar, Syed A Sarwar, Paulo Lobo, Nicholas D Cassimatis, Hanin Bachir, Vijay Sivan, Francis Ruzicka, Geoffrey R O’Malley, Nitesh V Patel, Ilona Cazorla Morales, Jasdeep S Hundal

doi:10.1016/j.wneu.2024.05.052

Abstract

Accurate differential diagnosis is critical in neurosurgery, where diagnostic delays carry significant health and economic costs. As large language models (LLMs) emerge as transformative tools in healthcare, this study examines their role in assisting neurosurgeons with the differential diagnosis process, especially during preliminary consultations. Three chat-based LLM platforms were evaluated for diagnostic accuracy: ChatGPT (versions 3.5 and 4.0), Perplexity AI, and Bard AI. Each LLM was prompted with clinical vignettes to generate differential diagnoses for 20 common and uncommon neurosurgical disorders, and its responses were recorded. Disease-specific prompts were crafted using Dynamed, a clinical reference tool. Accuracy was defined as the LLM's ability to correctly identify the target disease within its top differential diagnoses. For the top-ranked differential alone, ChatGPT 3.5 achieved an accuracy of 52.63%, while ChatGPT 4.0 performed slightly better at 53.68%. Perplexity AI and Bard AI reached 40.00% and 29.47%, respectively. As the number of differentials considered increased from two to five, ChatGPT 3.5 reached its peak accuracy of 77.89% for the top five differentials. Bard AI and Perplexity AI performed more variably, with Bard AI improving to 62.11% for the top five differentials. By disease, the LLMs excelled at diagnosing conditions such as epilepsy and cervical spine stenosis but struggled with more complex diseases such as Moyamoya disease and ALS. LLMs show potential to enhance diagnostic accuracy and decrease the incidence of missed diagnoses in neurosurgery.
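The accuracy measure described in the abstract (whether the target disease appears among the top differentials returned by an LLM) can be read as a top-k accuracy. The following is a minimal sketch under stated assumptions, not the authors' scoring procedure: the function names `normalize` and `top_k_accuracy` are hypothetical, the toy vignette data is invented for illustration and is not taken from the study, and matching by normalized string equality is an assumption, since the abstract does not say how a response was judged to match the target.

```python
from typing import Sequence


def normalize(name: str) -> str:
    """Lowercase a diagnosis label and collapse whitespace before comparing."""
    return " ".join(name.lower().split())


def top_k_accuracy(
    targets: Sequence[str],
    ranked_differentials: Sequence[Sequence[str]],
    k: int,
) -> float:
    """Percentage of vignettes whose target is among the first k differentials."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(targets) != len(ranked_differentials):
        raise ValueError("need one ranked list per target")
    if not targets:
        raise ValueError("need at least one vignette")
    hits = sum(
        normalize(target) in {normalize(d) for d in ranked[:k]}
        for target, ranked in zip(targets, ranked_differentials)
    )
    return 100.0 * hits / len(targets)


# Hypothetical toy data, NOT results from the study.
toy_targets = ["epilepsy", "moyamoya disease", "cervical spine stenosis", "ALS"]
toy_responses = [
    ["Epilepsy", "syncope", "migraine"],
    ["ischemic stroke", "vasculitis", "Moyamoya disease"],
    ["cervical spine stenosis", "multiple sclerosis"],
    ["cervical myelopathy", "multiple sclerosis", "myasthenia gravis"],
]

for k in (1, 3, 5):
    score = top_k_accuracy(toy_targets, toy_responses, k)
    print(f"top-{k} accuracy: {score:.2f}%")
```

Because the first k entries of a ranked list always include the first k - 1, this measure can only stay the same or rise as k grows, which is consistent with the abstract reporting higher figures for the top five differentials than for the first one.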

More From: World Neurosurgery

Similar Papers

The Diagnostic Performance of Large Language Models and General Radiologists in Thoracic Radiology Cases: A Comparative Study.
Yasin Celal Gunes ... Turay Cesur
13 Sep 2024
Journal of thoracic imaging | VOL. -

Large Language Model Influence on Diagnostic Reasoning
Ethan Goh ... Jonathan H Chen
28 Oct 2024
JAMA Network Open | VOL. 7

Generative AI enhanced with NCCN clinical practice guidelines for clinical decision support: A case study on bone cancer.
Yanshan Wang ... Xizhi Wu
01 Jun 2024
Journal of Clinical Oncology | VOL. 42

Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow: Development and Usability Study.
Arya Rao ... Marc D Succi
22 Aug 2023
Journal of medical Internet research | VOL. 25
