Assessing the Performance of a New Artificial Intelligence-Driven Diagnostic Support Tool Using Medical Board Exam Simulations: Clinical Vignette Study.

Niv Ben-Shabat, Tomer Weizman, Howard Amital, Ariel Sloma, David Kiderman

doi:10.2196/32507

Abstract

Background: Diagnostic decision support systems (DDSS) are computer programs that aim to improve health care by supporting clinicians in the process of diagnostic decision-making. Previous studies on DDSS demonstrated their ability to enhance clinicians’ diagnostic skills, prevent diagnostic errors, and reduce hospitalization costs. Despite the potential benefits, their utilization in clinical practice is limited, emphasizing the need for new and improved products.

Objective: The aim of this study was to conduct a preliminary analysis of the diagnostic performance of “Kahun,” a new artificial intelligence-driven diagnostic tool.

Methods: Diagnostic performance was evaluated based on the program’s ability to “solve” clinical cases from the United States Medical Licensing Examination Step 2 Clinical Skills board exam simulations that were drawn from the case banks of 3 leading preparation companies. Each case included 3 expected differential diagnoses. The cases were entered into the Kahun platform by 3 blinded junior physicians. For each case, the presence and the rank of the correct diagnoses within the generated differential diagnoses list were recorded. Diagnostic performance was measured in two ways: first, as diagnostic sensitivity, and second, as case-specific success rates that represent diagnostic comprehensiveness.

Results: The study included 91 clinical cases with 78 different chief complaints and a mean number of 38 (SD 8) findings for each case. The total number of expected diagnoses was 272, of which 174 were different (some appeared more than once). Of the 272 expected diagnoses, 231 (87.5%; 95% CI 76-99) diagnoses were suggested within the top 20 listed diagnoses, 209 (76.8%; 95% CI 66-87) were suggested within the top 10, and 168 (61.8%; 95% CI 52-71) within the top 5. The median rank of correct diagnoses was 3 (IQR 2-6). Of the 91 cases, all of the expected diagnoses were suggested within the top 20 listed diagnoses in 62 (68%; 95% CI 59-78), within the top 10 in 44 (48%; 95% CI 38-59), and within the top 5 in 24 (26%; 95% CI 17-35). In 87 (96%; 95% CI 91-100) of the 91 cases, at least 2 out of 3 of the case’s expected diagnoses were suggested within the top 20 listed diagnoses; in 78 (86%; 95% CI 79-93), within the top 10; and in 61 (67%; 95% CI 57-77), within the top 5.

Conclusions: The diagnostic support tool evaluated in this study demonstrated good diagnostic accuracy and comprehensiveness; it also had the ability to manage a wide range of clinical findings.
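The two measures described in the abstract (diagnosis-level sensitivity within the top k suggestions, and case-level success rates) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the data, the case names, and the function names `sensitivity_at_k` and `case_success_at_k` are hypothetical, and the sketch does not attempt to reproduce the reported confidence intervals, since the abstract does not state how they were computed.

```python
from typing import Dict, List, Optional

# Hypothetical toy data (not from the study): for each case, the rank at which
# each expected diagnosis appeared in the tool's list (None = not listed).
ranks_by_case: Dict[str, List[Optional[int]]] = {
    "case_a": [1, 4, 12],
    "case_b": [2, 7, None],
    "case_c": [3, 25, 6],
}


def sensitivity_at_k(cases: Dict[str, List[Optional[int]]], k: int) -> float:
    """Share of all expected diagnoses that were ranked within the top k."""
    ranks = [r for case in cases.values() for r in case]
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


def case_success_at_k(
    cases: Dict[str, List[Optional[int]]],
    k: int,
    min_hits: Optional[int] = None,
) -> float:
    """Share of cases with at least `min_hits` expected diagnoses in the top k.

    With min_hits=None, every expected diagnosis of the case must be found.
    """
    successes = 0
    for case in cases.values():
        hits = sum(1 for r in case if r is not None and r <= k)
        needed = len(case) if min_hits is None else min_hits
        if hits >= needed:
            successes += 1
    return successes / len(cases)


for k in (5, 10, 20):
    print(
        f"top {k}: sensitivity={sensitivity_at_k(ranks_by_case, k):.2f}, "
        f"all found={case_success_at_k(ranks_by_case, k):.2f}, "
        f">=2 found={case_success_at_k(ranks_by_case, k, min_hits=2):.2f}"
    )
```

Sensitivity pools every expected diagnosis across cases, whereas the case-level rates ask how completely each individual case was covered, which is why the two can diverge.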

Journal: JMIR Medical Informatics	Publication Date: Nov 30, 2021
Citations: 6	License type: cc-by

Similar Papers

United States Medical Licensure Examination step 1 scores and obstetrics-gynecology clerkship final examination
Thomas D Myles
Obstetrics & Gynecology | VOL. 94
17 Nov 1999

Evaluating Diagnostic Accuracy of a New Artificial-Intelligence Driven Diagnostic Support Tool
Niv Ben-Shabat ... David Kiderman
SSRN Electronic Journal | VOL. -
01 Jan 2020

An Institutional Review: Which Metrics Correlate With a Successful United States Medical Licensing Examination Step 1 Score?
William D Shepard ... Kathlyn K Powell
Journal of Oral and Maxillofacial Surgery | VOL. 78
21 Sep 2019

Influence of curriculum type on student performance in the United States Medical Licensing Examination Step 1 and Step 2 exams: problem-based learning vs. lecture-based curriculum.
Cam Enarson ... Liza Cariaga-Lo
Medical Education | VOL. 35
04 Nov 2001
