Occult scaphoid fractures on initial radiographs of an injury are a diagnostic challenge to physicians. Although artificial intelligence models based on the principles of deep convolutional neural networks (CNN) offer a potential method of detection, it is unknown how such models perform in the clinical setting.
(1) Does CNN-assisted image interpretation improve interobserver agreement for scaphoid fractures? (2) What are the sensitivity and specificity of image interpretation performed with and without CNN assistance (as stratified by type: normal scaphoid, occult fracture, and apparent fracture)? (3) Does CNN assistance improve time to diagnosis and physician confidence level?
This survey-based experiment presented 15 scaphoid radiographs (five normal, five apparent fractures, and five occult fractures) with and without CNN assistance to physicians in a variety of practice settings across the United States and Taiwan. Occult fractures were identified by follow-up CT scans or MRI. Participants met the following criteria: Postgraduate Year 3 or above resident physicians in plastic surgery, orthopaedic surgery, or emergency medicine; hand fellows; and attending physicians. Among the 176 invited participants, 120 completed the survey and met the inclusion criteria. Of the participants, 31% (37 of 120) were fellowship-trained hand surgeons, 43% (52 of 120) were plastic surgeons, and 69% (83 of 120) were attending physicians. Most participants (73% [88 of 120]) worked in academic centers, whereas the remainder worked in large, urban private practice hospitals. Recruitment occurred between February 2022 and March 2022. Radiographs with CNN assistance were accompanied by predictions of fracture presence and gradient-weighted class activation mapping of the predicted fracture site. Sensitivity and specificity of the CNN-assisted physician diagnoses were calculated to assess diagnostic performance.
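As a rough illustration of the sensitivity and specificity calculation mentioned above, the following sketch compares a reader's calls against a reference standard such as follow-up CT or MRI. The function name and the toy data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for paired boolean labels.

    truth: reference standard, True where a fracture is present.
    predicted: reader's diagnosis, True where a fracture was called.
    Either value is None when its denominator is zero.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # fractures correctly called
    fn = sum(1 for t, p in pairs if t and not p)      # fractures missed
    tn = sum(1 for t, p in pairs if not t and not p)  # normals correctly called
    fp = sum(1 for t, p in pairs if not t and p)      # normals called fractured
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Illustrative data only: five radiographs read by one hypothetical reader.
truth = [True, True, False, False, True]
predicted = [True, False, False, True, True]
sens, spec = sensitivity_specificity(truth, predicted)
```

In a stratified analysis like the one described, the same function could be applied separately to each radiograph type.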
We calculated interobserver agreement with the Gwet agreement coefficient (AC1). Physician diagnostic confidence was estimated using a self-assessment Likert scale, and the time to arrive at a diagnosis for each case was measured.
Interobserver agreement among physicians for occult scaphoid radiographs was higher with CNN assistance than without (AC1 0.42 [95% CI 0.17 to 0.68] versus 0.06 [95% CI 0.00 to 0.17], respectively). No clinically relevant differences were observed in time to arrive at a diagnosis (18 ± 12 seconds versus 30 ± 27 seconds, mean difference 12 seconds [95% CI 6 to 17]; p < 0.001) or diagnostic confidence levels (7.2 ± 1.7 versus 6.2 ± 1.6 on the Likert scale; mean difference 1 [95% CI 0.5 to 1.3]; p < 0.001) for occult fractures.
CNN assistance improves physician diagnostic sensitivity and specificity as well as interobserver agreement for the diagnosis of occult scaphoid fractures. The differences observed in diagnostic speed and confidence are likely not clinically relevant. Despite these improvements in clinical diagnoses of scaphoid fractures with the CNN, it is unknown whether development and implementation of such models are cost-effective.
Level II, diagnostic study.
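For readers unfamiliar with the Gwet agreement coefficient, a minimal sketch of the standard multi-rater AC1 formula for nominal categories follows. It assumes every radiograph is rated by at least two readers; the function name and the toy ratings are hypothetical, and the study's own software and confidence interval method are not reproduced here.

```python
from collections import Counter


def gwet_ac1(ratings, categories):
    """Gwet's AC1 for several raters and nominal categories.

    ratings: one list per subject holding the label given by each rater.
    categories: all possible labels, e.g. [0, 1] for no fracture / fracture.
    Assumes each subject has at least two ratings.
    """
    q = len(categories)
    n = len(ratings)
    agreement = 0.0
    prevalence = {k: 0.0 for k in categories}
    for subject in ratings:
        r = len(subject)
        counts = Counter(subject)
        # Share of rater pairs that agree on this subject.
        agreement += sum(c * (c - 1) for c in counts.values()) / (r * (r - 1))
        for k in categories:
            prevalence[k] += counts[k] / r
    p_a = agreement / n                                  # observed agreement
    pi = [v / n for v in prevalence.values()]            # mean category shares
    p_e = sum(p * (1 - p) for p in pi) / (q - 1)         # chance agreement
    return (p_a - p_e) / (1 - p_e)


# Illustrative data only: two radiographs, three hypothetical readers each.
example = gwet_ac1([[1, 1, 0], [0, 0, 0]], categories=[0, 1])
```

Unlike Cohen's or Fleiss' kappa, AC1 stays defined and close to 1 when nearly all raters choose the same category, which is one common reason it is chosen for skewed diagnostic data.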