The Performance of a Deep Learning-Based Automatic Measurement Model for Measuring the Cardiothoracic Ratio on Chest Radiographs.

Donguk Kim, Jong Hyuk Lee, Si Yeong Yang, Wonju Hong, Myoung-Jin Jang, Chan Su Lee, Chang Min Park, Jongsoo Park

doi:10.3390/bioengineering10091077

Abstract

Prior studies of deep learning (DL)-based models that measure the cardiothoracic ratio (CTR) on chest radiographs have lacked rigorous agreement analyses with radiologists or reader tests. We validated the performance of a commercially available DL-based CTR measurement model on chest radiographs with various thoracic pathologies, and performed agreement analyses with thoracic radiologists and reader tests using a probabilistic reference standard. This study included 160 posteroanterior chest radiographs in four categories (no lung or pleural abnormalities, pneumothorax, pleural effusion, and consolidation; n = 40 in each category) to externally test the DL-based CTR measurement model. To assess the agreement between the model and experts, intraclass or interclass correlation coefficients (ICCs) were compared between the model and two thoracic radiologists. In the reader tests with a probabilistic reference standard (Dawid-Skene consensus), we compared diagnostic measures for cardiomegaly, including sensitivity and negative predictive value (NPV), between the model and five other radiologists using a non-inferiority test. For the 160 chest radiographs, the model measured a median CTR of 0.521 (interquartile range, 0.446-0.59) and a mean CTR of 0.522 ± 0.095. The ICC between the two thoracic radiologists did not differ significantly from the ICC between the model and the two thoracic radiologists (0.972 versus 0.959, p = 0.192), even across the various pathologies (all p-values > 0.05). Compared with the radiologists, the model showed non-inferior diagnostic performance for all 160 chest radiographs, including sensitivity (96.3% versus 97.8%) and NPV (95.6% versus 97.4%) (p < 0.001 for both). However, non-inferiority was not demonstrated for sensitivity in chest radiographs with consolidation (95.5% versus 99.9%; p = 0.082) or for NPV in chest radiographs with pleural effusion (92.9% versus 94.6%; p = 0.079) and consolidation (94.1% versus 98.7%; p = 0.173).
While the sensitivity and NPV of this model for diagnosing cardiomegaly in chest radiographs with consolidation or pleural effusion were not as high as those of the radiologists, it demonstrated good agreement with the thoracic radiologists in measuring the CTR across various pathologies.
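
As a reading aid for the quantities reported above, the following is a minimal sketch of how a CTR and the two headline diagnostic measures can be computed. It is not the evaluated commercial model or the study's analysis code. The function names, the millimetre inputs, and the 0.5 cut-off for cardiomegaly are illustrative assumptions; the abstract does not state the threshold used.

```python
from typing import Sequence, Tuple


def cardiothoracic_ratio(cardiac_width_mm: float, thoracic_width_mm: float) -> float:
    """CTR = maximal transverse cardiac width / maximal internal thoracic width."""
    if cardiac_width_mm < 0 or thoracic_width_mm <= 0:
        raise ValueError("widths must be non-negative, thoracic width must be > 0")
    return cardiac_width_mm / thoracic_width_mm


def is_cardiomegaly(ctr: float, threshold: float = 0.5) -> bool:
    """Assumption: the conventional cut-off of CTR > 0.5 (not stated in the abstract)."""
    return ctr > threshold


def sensitivity_and_npv(
    reference: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, NPV) for binary labels against a reference standard.

    sensitivity = TP / (TP + FN); NPV = TN / (TN + FN).
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    tp = fn = tn = 0
    for ref, pred in zip(reference, predicted):
        if ref and pred:
            tp += 1          # true positive
        elif ref and not pred:
            fn += 1          # missed cardiomegaly
        elif not ref and not pred:
            tn += 1          # true negative
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return sensitivity, npv
```

Both measures penalize false negatives, which is why they are natural choices for a tool intended to avoid missing cardiomegaly.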

Full Text