BackgroundWe evaluated the diagnostic efficacy of deep learning radiomics (DLR) and hand-crafted radiomics (HCR) features in differentiating acute and chronic vertebral compression fractures (VCFs).MethodsA total of 365 patients with VCFs were retrospectively analysed based on their computed tomography (CT) scan data. All patients completed MRI examination within 2 weeks. There were 315 acute VCFs and 205 chronic VCFs. Deep transfer learning (DTL) features and HCR features were extracted from CT images of patients with VCFs using DLR and traditional radiomics, respectively, and feature fusion was performed to establish the least absolute shrinkage and selection operator. The MRI display of vertebral bone marrow oedema was used as the gold standard for acute VCF, and the model performance was evaluated using the receiver operating characteristic (ROC).To separately evaluate the effectiveness of DLR, traditional radiomics and feature fusion in the differential diagnosis of acute and chronic VCFs, we constructed a nomogram based on the clinical baseline data to visualize the classification evaluation. The predictive power of each model was compared using the Delong test, and the clinical value of the nomogram was evaluated using decision curve analysis (DCA).ResultsFifty DTL features were obtained from DLR, 41 HCR features were obtained from traditional radiomics, and 77 features fusion were obtained after feature screening and fusion of the two. The area under the curve (AUC) of the DLR model in the training cohort and test cohort were 0.992 (95% confidence interval (CI), 0.983-0.999) and 0.871 (95% CI, 0.805-0.938), respectively. While the AUCs of the conventional radiomics model in the training cohort and test cohort were 0.973 (95% CI, 0.955-0.990) and 0.854 (95% CI, 0.773-0.934), respectively. The AUCs of the features fusion model in the training cohort and test cohort were 0.997 (95% CI, 0.994-0.999) and 0.915 (95% CI, 0.855-0.974), respectively. The AUCs of nomogram constructed by the features fusion in combination with clinical baseline data were 0.998 (95% CI, 0.996–0.999) and 0.946 (95% CI, 0.906–0.987) in the training cohort and test cohort, respectively. The Delong test showed that the differences between the features fusion model and the nomogram in the training cohort and the test cohort were not statistically significant (P values were 0.794 and 0.668, respectively), and the differences in the other prediction models in the training cohort and the test cohort were statistically significant (P < 0.05). DCA showed that the nomogram had high clinical value.ConclusionThe features fusion model can be used for the differential diagnosis of acute and chronic VCFs, and its differential diagnosis ability is improved when compared with that when either radiomics is used alone. At the same time, the nomogram has a high predictive value for acute and chronic VCFs and can be a potential decision-making tool to assist clinicians, especially when a patient is unable to undergo spinal MRI examination.