Evaluating the Robustness of a Deep Learning Bone Age Algorithm to Clinical Image Variation Using Computational Stress Testing.

Samantha M Santomartino, Kristin Putman, Paul H Yi, Elham Beheshtian, Vishwa S Parekh

doi:10.1148/ryai.230240

Abstract

Purpose: To evaluate the robustness of an award-winning bone age deep learning (DL) model to extensive variations in image appearance.

Materials and Methods: In December 2021, the DL bone age model that won the 2017 RSNA Pediatric Bone Age Challenge was retrospectively evaluated using the RSNA validation set (1425 pediatric hand radiographs; internal test set in this study) and the Digital Hand Atlas (DHA) (1202 pediatric hand radiographs; external test set). Each test image underwent seven types of transformations (rotations, flips, brightness, contrast, inversion, laterality marker, and resolution) to represent a range of image appearances, many of which simulate real-world variations. Computational "stress tests" were performed by comparing the model's predictions on baseline and transformed images. Mean absolute differences (MADs) of predicted bone ages compared with radiologist-determined ground truth on baseline versus transformed images were compared using Wilcoxon signed rank tests. The proportion of clinically significant errors (CSEs) was compared using McNemar tests.

Results: There was no evidence of a difference in MAD of the model on the two baseline test sets (RSNA = 6.8 months, DHA = 6.9 months; P = .05), indicating good model generalization to external data. Except for the RSNA dataset images with an appended radiologic laterality marker (P = .86), there were significant differences in MAD for both the DHA and RSNA datasets among other transformation groups (rotations, flips, brightness, contrast, inversion, and resolution). There were significant differences in proportion of CSEs for 57% of the image transformations (19 of 33) performed on the DHA dataset.

Conclusion: Although an award-winning pediatric bone age DL model generalized well to curated external images, it had inconsistent predictions on images that had undergone simple transformations reflective of several real-world variations in image appearance.
Keywords: Pediatrics, Hand, Convolutional Neural Network, Radiography

Supplemental material is available for this article.

© RSNA, 2024

See also commentary by Faghani and Erickson in this issue.
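
The stress-testing procedure described in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' code. The image representation (nested lists of grayscale values in [0, 1]), the `toy_model` stand-in, the `threshold_months` parameter used to flag a clinically significant error, and every function name are assumptions made for illustration; the abstract does not state how a CSE is defined. The sketch covers four of the seven transformation types, the MAD comparison against ground truth, and an exact McNemar test on paired CSE flags. The Wilcoxon signed rank comparison is omitted for brevity.

```python
import math
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]  # hypothetical format: grayscale pixels in [0, 1]


def flip_horizontal(img: Image) -> Image:
    return [row[::-1] for row in img]


def rotate_90_clockwise(img: Image) -> Image:
    return [list(col) for col in zip(*img[::-1])]


def invert(img: Image) -> Image:
    return [[1.0 - p for p in row] for row in img]


def adjust_brightness(img: Image, delta: float) -> Image:
    # Shift every pixel by delta and clip to the valid [0, 1] range.
    return [[min(1.0, max(0.0, p + delta)) for p in row] for row in img]


def mean_absolute_difference(pred: Sequence[float], truth: Sequence[float]) -> float:
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def cse_flags(pred: Sequence[float], truth: Sequence[float],
              threshold_months: float) -> List[bool]:
    # ASSUMPTION: a CSE is flagged when the absolute error exceeds a threshold.
    return [abs(p - t) > threshold_months for p, t in zip(pred, truth)]


def mcnemar_exact_p(base_err: Sequence[bool], trans_err: Sequence[bool]) -> float:
    # Two-sided exact McNemar test on the discordant pairs.
    b = sum(1 for x, y in zip(base_err, trans_err) if x and not y)
    c = sum(1 for x, y in zip(base_err, trans_err) if y and not x)
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(math.comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def stress_test(model: Callable[[Image], float], images: Sequence[Image],
                truth: Sequence[float],
                transform: Callable[[Image], Image]) -> Tuple[float, float]:
    # Returns (baseline MAD, transformed MAD), both against ground truth.
    base = [model(im) for im in images]
    trans = [model(transform(im)) for im in images]
    return (mean_absolute_difference(base, truth),
            mean_absolute_difference(trans, truth))


def toy_model(img: Image) -> float:
    # HYPOTHETICAL stand-in for a bone age model (output in months): it only
    # looks at the left half of the image, so flips change its prediction.
    half = max(1, len(img[0]) // 2)
    left = [p for row in img for p in row[:half]]
    return 240.0 * sum(left) / len(left)


if __name__ == "__main__":
    images = [[[0.5, 0.1], [0.5, 0.1]], [[0.25, 0.75], [0.25, 0.75]]]
    truth = [120.0, 60.0]
    print(stress_test(toy_model, images, truth, flip_horizontal))
```

In a real evaluation, `toy_model` would be replaced by the trained network's inference call, and the paired CSE flags from baseline and transformed predictions would be passed to `mcnemar_exact_p` once per transformation.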

Journal: Radiology: Artificial Intelligence
Publication Date: Mar 13, 2024
Citations: 2

