The purpose of this study was to develop and validate a new feature fusion algorithm to improve the classification performance of benign and malignant ground-glass nodules (GGNs) based on deep learning. We retrospectively collected 385 cases of GGNs confirmed by surgical pathology from three hospitals. We utilized 239 GGNs from Hospital 1 as the training and internal validation set, and 115 and 31 GGNs from Hospital 2 and Hospital 3, respectively, as external test sets 1 and 2. Among these GGNs, 172 were benign and 203 were malignant. First, we evaluated clinical and morphological features of GGNs at baseline chest CT and simultaneously extracted whole-lung radiomics features. Then, deep convolutional neural networks (CNNs) and backpropagation neural networks (BPNNs) were applied to extract deep features from whole-lung CT images, clinical, morphological features, and whole-lung radiomics features separately. Finally, we integrated these four types of deep features using an attention mechanism. Multiple metrics were employed to evaluate the predictive performance of the model. The deep learning model integrating clinical, morphological, radiomics and whole lung CT image features with attention mechanism (CMRI-AM) achieved the best performance, with area under the curve (AUC) values of 0.941 (95% CI: 0.898-0.972), 0.861 (95% CI: 0.823-0.882), and 0.906 (95% CI: 0.878-0.932) on the internal validation set, external test set 1, and external test set 2, respectively. The AUC differences between the CMRI-AM model and other feature combination models were statistically significant in all three groups (all p<0.05). Our experimental results demonstrated that (1) applying attention mechanism to fuse whole-lung CT images, radiomics features, clinical, and morphological features is feasible, (2) clinical, morphological, and radiomics features provide supplementary information for the classification of benign and malignant GGNs based on CT images, and (3) utilizing baseline whole-lung CT features to predict the benign and malignant of GGNs is an effective method. Therefore, optimizing the fusion of baseline whole-lung CT features can effectively improve the classification performance of GGNs.
Read full abstract