Evaluation of ChatGPT-Generated Differential Diagnosis for Common Diseases With Atypical Presentation: Descriptive Research.

Kiyoshi Shikino, Yu Yamamoto, Hiroki Tamura, Fumina Orihara, Morika Suzuki, Teiko Kawahigashi, Yuki Otsuka, Toru Morikawa, Sayaka Aoyama, Takashi Watari, Shintaro Kosaka, Tomohiro Matsumoto, Yosuke Sasaki, Fumio Otsuka, Yoji Hoshina, Yasuharu Tokuda, Kotaro Kunitomo, Takahashi Hiromizu, Takuma Saito, Toshinori Nishizawa, Satoshi Watanuki, Gemmei Iizuka, Taro Shimizu, Masaki Tago, Koichi Nakashima, Yuichiro Matsuo, Midori Tokushima, Hirofumi Kimura, Yuto Unoki

doi:10.2196/58758

Abstract

Diagnostic errors persist despite advances in medical knowledge and diagnostics, which highlights the importance of understanding atypical disease presentations and their contribution to mortality and morbidity. Artificial intelligence (AI), particularly generative pre-trained transformers like GPT-4, holds promise for improving diagnostic accuracy, but its handling of atypical presentations requires further exploration. This study aimed to assess the diagnostic accuracy of ChatGPT in generating differential diagnoses for atypical presentations of common diseases, with a focus on the model's reliance on patient history during the diagnostic process. We used 25 clinical vignettes from the Journal of Generalist Medicine characterizing atypical manifestations of common diseases. Two general medicine physicians categorized the cases by degree of atypicality. ChatGPT was then used to generate differential diagnoses based on the clinical information provided. The concordance between AI-generated and final diagnoses was measured for the top-ranked disease (top 1) and for the top 5 differential diagnoses (top 5). ChatGPT's diagnostic accuracy decreased as presentations became more atypical. For category 1 (C1) cases, the concordance rates were 17% (n=1) for the top 1 and 67% (n=4) for the top 5. Categories 3 (C3) and 4 (C4) showed 0% concordance for the top 1 and markedly lower rates for the top 5, indicating difficulties in handling highly atypical cases. The χ² test revealed no significant difference in top 1 differential diagnosis accuracy between the less atypical (C1+C2) and more atypical (C3+C4) groups (χ²₁=2.07; n=25; P=.13). However, a significant difference was found in the top 5 analysis, with less atypical cases showing higher accuracy (χ²₁=4.01; n=25; P=.048). ChatGPT-4 demonstrates potential as an auxiliary tool for diagnosing typical and mildly atypical presentations of common diseases. However, its performance declines with greater atypicality. The study findings underscore the need for AI systems to encompass a broader range of linguistic capabilities, cultural understanding, and diverse clinical scenarios to improve diagnostic utility in real-world settings.
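
The top 1 and top 5 concordance measures described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' procedure: the abstract does not say how a match was judged, so exact case-insensitive string matching is an assumption here, and the function names (`top_k_concordant`, `concordance_rate`) and the placeholder cases are hypothetical and carry no study data.

```python
from typing import Dict, List, Sequence, Tuple


def top_k_concordant(final_diagnosis: str,
                     ranked_differentials: Sequence[str],
                     k: int) -> bool:
    """Return True if the final diagnosis is among the first k differentials.

    Assumption: a match is an exact, case-insensitive string match.
    """
    target = final_diagnosis.strip().lower()
    return any(d.strip().lower() == target for d in ranked_differentials[:k])


def concordance_rate(cases: List[Dict], k: int) -> Tuple[int, float]:
    """Return (number of concordant cases, concordance rate) for a given k."""
    hits = sum(
        top_k_concordant(c["final"], c["differentials"], k) for c in cases
    )
    return hits, hits / len(cases)


# Hypothetical placeholder cases, not taken from the study.
cases = [
    {"final": "diagnosis A",
     "differentials": ["diagnosis A", "diagnosis B", "diagnosis C"]},
    {"final": "diagnosis B",
     "differentials": ["diagnosis C", "diagnosis B", "diagnosis A"]},
    {"final": "diagnosis C",
     "differentials": ["diagnosis A", "diagnosis B", "diagnosis D"]},
]

top1_hits, top1_rate = concordance_rate(cases, k=1)
top5_hits, top5_rate = concordance_rate(cases, k=5)
print(f"top 1: {top1_rate:.0%} (n={top1_hits})")
print(f"top 5: {top5_rate:.0%} (n={top5_hits})")
```

With these placeholder cases, one of three is concordant at top 1 and two of three at top 5. The same arithmetic underlies the reported C1 figures, where 17% (n=1) and 67% (n=4) are consistent with six C1 cases, although that count is inferred from the percentages and not stated in the abstract.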

Journal: JMIR Medical Education
Publication Date: Jun 21, 2024
License type: CC BY