External validation of deep learning-based bone-age software: a preliminary study with real world data

Winnah Wu-In Lea, Woo-Young Kang, Eun-Jin Noh, Suk-Joo Hong, Hyo-Kyoung Nam, Ze-Pa Yang

doi:10.1038/s41598-022-05282-z

Open Access

https://doi.org/10.1038/s41598-022-05282-z

Abstract

Artificial intelligence (AI) is increasingly being used in bone-age (BA) assessment because manual assessment is complicated and time-consuming. We aimed to evaluate the clinical performance of a commercially available deep learning (DL)-based software for BA assessment using real-world data. From Nov. 2018 to Feb. 2019, 474 children (35 boys, 439 girls, age 4–17 years) were enrolled. We compared the BA estimated by the DL software (DL-BA) with the BA independently estimated by 3 reviewers (R1: musculoskeletal radiologist, R2: radiology resident, R3: pediatric endocrinologist) using the traditional Greulich–Pyle atlas, and then compared each estimate with the child's chronological age (CA). A paired t-test, Pearson’s correlation coefficient, Bland–Altman plot, mean absolute error (MAE) and root mean square error (RMSE) were used for the statistical analysis. The intraclass correlation coefficient (ICC) was used to assess inter-rater variation. There were significant differences between DL-BA and each reviewer’s BA (P < 0.025), but the estimates correlated well with one another (r = 0.983, P < 0.025). RMSE (MAE) values were 10.09 (7.21), 10.76 (7.88) and 13.06 (10.06) months between DL-BA and the R1, R2 and R3 BA, respectively. Compared with the CA, RMSE (MAE) values were 13.54 (11.06), 15.18 (12.11), 16.19 (12.78) and 19.53 (17.71) months for the DL-BA, R1, R2 and R3 BA, respectively. Bland–Altman plots revealed a general tendency of both the software and the reviewers to overestimate the BA. ICC values between the 3 reviewers were 0.97, 0.85 and 0.86, and the overall ICC value was 0.93. When compared with the chronological age in a real-world clinic, the BA estimated by the DL-based software showed performance statistically similar to, or even better than, that of the reviewers.
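
The agreement metrics named in the abstract (MAE, RMSE and the Bland–Altman bias with its limits of agreement) can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function name agreement_metrics and the four sample values are hypothetical, and the 1.96 multiplier assumes the conventional 95% limits of agreement.

```python
import math
from statistics import mean, stdev


def agreement_metrics(estimated, reference):
    """Return MAE, RMSE and Bland-Altman bias with 95% limits of agreement.

    Both inputs are bone ages in months, paired by child.
    """
    if len(estimated) != len(reference) or len(estimated) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    # Signed differences: positive means `estimated` is older than `reference`.
    diffs = [e - r for e, r in zip(estimated, reference)]
    mae = mean(abs(d) for d in diffs)
    rmse = math.sqrt(mean(d * d for d in diffs))
    bias = mean(diffs)  # centre line of the Bland-Altman plot
    spread = 1.96 * stdev(diffs)  # assumed 95% limits of agreement
    return {"mae": mae, "rmse": rmse, "bias": bias,
            "loa": (bias - spread, bias + spread)}


# Made-up values in months, for illustration only (not study data).
dl_ba = [120.0, 96.0, 150.0, 132.0]
reviewer_ba = [114.0, 100.0, 141.0, 132.0]
metrics = agreement_metrics(dl_ba, reviewer_ba)
print(metrics)
```

RMSE squares each difference before averaging, so it is never smaller than MAE and is more sensitive to a few large disagreements. That is consistent with every RMSE in the abstract exceeding its paired MAE.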

Highlights

Artificial intelligence (AI) is increasingly being used in bone-age (BA) assessment because manual assessment is complicated and time-consuming
For a deep learning-based automatic software system to be used in clinical settings, a carefully designed external validation study is needed, with datasets consisting of newly recruited patients or patients from other institutions who exhibit characteristics similar to those of patients in a real-world setting [11]
In the analysis with the deep learning (DL)-estimated BA (DL-BA), the paired t-test between the Reviewer 1 (R1)-estimated BA (R1-BA) and the DL-BA gave a P value of less than 0.025, which indicates a significant difference between them
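
A significant paired difference and a high correlation can coexist, because the paired t-test looks at the mean offset between two raters while Pearson's r only measures how linearly their values move together. The following Python sketch illustrates this with made-up numbers. The helper names paired_t_statistic and pearson_r and all sample values are hypothetical and are not taken from the study. The sketch stops at the t statistic, since a P value needs the t distribution (available, for example, through scipy.stats.ttest_rel).

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Return the paired t statistic and its degrees of freedom."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def pearson_r(a, b):
    """Return Pearson's correlation coefficient for two paired samples."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Made-up bone ages in months: the second rater reads consistently older.
rater_a = [100.0, 110.0, 120.0, 130.0, 140.0]
rater_b = [103.0, 112.0, 124.0, 133.0, 145.0]

t_stat, dof = paired_t_statistic(rater_a, rater_b)
r = pearson_r(rater_a, rater_b)
print(f"t = {t_stat:.2f} (df = {dof}), r = {r:.4f}")
```

In this toy data the consistent offset yields a large-magnitude t statistic even though r is close to 1, which is the same pattern the highlights describe for DL-BA and the reviewers.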

Summary

Introduction

Artificial intelligence (AI) is increasingly being used in bone-age (BA) assessment because manual assessment is complicated and time-consuming. When compared with the chronological age in a real-world clinic, the BA estimated by the DL-based software showed performance statistically similar to, or even better than, that of the reviewers. In the Tanner-Whitehouse (TW) method, each bone of the left hand and wrist is given a score by comparison with a standard set of bones at different stages of maturation, and the total score is calculated to determine the BA. Because both the atlas-based Greulich–Pyle method and the TW method are rather time-consuming and their results tend to vary with the clinician’s experience, their use in BA assessment has been difficult to optimize. For a deep learning-based automatic software system to be used in clinical settings, a carefully designed external validation study is needed, with datasets consisting of newly recruited patients or patients from other institutions who exhibit characteristics similar to those of patients in a real-world setting [11].

Journal: Scientific Reports
Publication Date: Jan 24, 2022
Citations: 7
License type: open-access
