Abstract

Objectives: Diagnostic accuracy of artificial intelligence (AI) pneumothorax (PTX) detection in chest radiographs (CXR) is limited by the noisy annotation quality of public training data and confounding thoracic tubes (TT). We hypothesize that in-image annotations of the dehiscent visceral pleura for algorithm training boost the algorithm’s performance and suppress confounders.

Methods: Our single-center evaluation cohort of 3062 supine CXRs includes 760 PTX-positive cases with radiological annotations of PTX size and inserted TTs. Three step-by-step improved algorithms (differing in algorithm architecture, training data from public datasets/clinical sites, and in-image annotations included in algorithm training) were characterized by area under the receiver operating characteristics (AUROC) in detailed subgroup analyses and referenced to the well-established “CheXNet” algorithm.

Results: Performances of established algorithms exclusively trained on publicly available data without in-image annotations are limited to AUROCs of 0.778 and strongly biased towards TTs, which can completely eliminate the algorithm’s discriminative power in individual subgroups. In contrast, our final “algorithm 2”, which was trained on a smaller number of images but additionally with in-image annotations of the dehiscent pleura, achieved an overall AUROC of 0.877 for unilateral PTX detection with a significantly reduced TT-related confounding bias.

Conclusions: We demonstrated strong limitations of an established PTX-detecting AI algorithm that can be significantly reduced by designing an AI system capable of learning to both classify and localize PTX. Our results are aimed at drawing attention to the necessity of high-quality in-image localization in training data to reduce the risks of unintentionally biasing the training process of pathology-detecting AI algorithms.

Key Points

  • Established pneumothorax-detecting artificial intelligence algorithms trained on public training data are strongly limited and biased by confounding thoracic tubes.

  • We used high-quality in-image annotated training data to effectively boost algorithm performance and suppress the impact of confounding thoracic tubes.

  • Based on our results, we hypothesize that even hidden confounders might be effectively addressed by in-image annotations of pathology-related image features.

Highlights

  • Chest radiography is the most commonly performed diagnostic imaging procedure throughout the world and has a relevant impact on public health [1, 2]

  • In addition to graphic receiver operating characteristic (ROC) illustrations (Figs. 2, 3, and 5; Supplementary Figure 1), the most relevant resulting areas under the receiver operating characteristics (AUROCs) will be compared by box plots in the summarizing Fig. 4

Summary

Introduction

Chest radiography is the most commonly performed diagnostic imaging procedure throughout the world and has a relevant impact on public health [1, 2]. Several AI algorithms, trained on publicly available datasets, have demonstrated potential to detect PTX in CXRs, with diagnostic accuracies quantified by areas under the receiver operating characteristics (AUROCs) of up to 0.937 [8–13]. In the studies evaluating these algorithms, performance was assessed on data derived from public datasets [8, 14, 15]. The limited labeling within these datasets allows neither a detailed subgroup analysis nor the identification of confounders and their impact on the performance of AI algorithms.
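
To illustrate why subgroup analysis matters for identifying confounders, the following minimal Python sketch uses a small synthetic toy cohort (invented for illustration, not data from this study) and a hypothetical classifier, `tube_only_score`, whose output depends only on whether a thoracic tube is present. The AUROC is computed with the pairwise (Mann-Whitney) formulation, which is an assumption of this sketch and not necessarily the procedure used by the authors.

```python
from typing import List, Tuple

# Each toy record is (ptx_label, has_tube). Purely synthetic, NOT study data.
Record = Tuple[int, bool]


def auroc(labels: List[int], scores: List[float]) -> float:
    """AUROC as the probability that a random positive case scores higher
    than a random negative case, counting ties as 0.5 (Mann-Whitney form)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def tube_only_score(has_tube: bool) -> float:
    """Hypothetical confounded classifier: it reacts to the tube, not the PTX."""
    return 0.9 if has_tube else 0.1


def evaluate(records: List[Record]) -> float:
    """Score every record with the confounded classifier and return the AUROC."""
    labels = [y for y, _ in records]
    scores = [tube_only_score(tube) for _, tube in records]
    return auroc(labels, scores)


# Assumed toy cohort: tubes are more common among PTX-positive cases.
cohort: List[Record] = (
    [(1, True)] * 6      # PTX with tube
    + [(1, False)] * 2   # PTX without tube
    + [(0, True)] * 2    # no PTX, tube present
    + [(0, False)] * 6   # no PTX, no tube
)

overall = evaluate(cohort)
with_tube = evaluate([r for r in cohort if r[1]])
without_tube = evaluate([r for r in cohort if not r[1]])

print(f"overall AUROC:       {overall:.2f}")
print(f"subgroup with TT:    {with_tube:.2f}")
print(f"subgroup without TT: {without_tube:.2f}")
```

On this toy cohort the overall AUROC is 0.75, while both subgroup AUROCs are 0.50, that is, chance level. A classifier that merely detects the confounder can therefore look useful on pooled data and still have no discriminative power once the cohort is stratified by tube status, which is only detectable when the evaluation data carry labels for the confounder.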
