BackgroundTo develop a deep learning model to classify primary bone tumors from preoperative radiographs and compare performance with radiologists.MethodsA total of 1356 patients (2899 images) with histologically confirmed primary bone tumors and pre-operative radiographs were identified from five institutions’ pathology databases. Manual cropping was performed by radiologists to label the lesions. Binary discriminatory capacity (benign versus not-benign and malignant versus not-malignant) and three-way classification (benign versus intermediate versus malignant) performance of our model were evaluated. The generalizability of our model was investigated on data from external test set. Final model performance was compared with interpretation from five radiologists of varying level of experience using the Permutations tests.FindingsFor benign vs. not benign, model achieved area under curve (AUC) of 0•894 and 0•877 on cross-validation and external testing, respectively. For malignant vs. not malignant, model achieved AUC of 0•907 and 0•916 on cross-validation and external testing, respectively. For three-way classification, model achieved 72•1% accuracy vs. 74•6% and 72•1% for the two subspecialists on cross-validation (p = 0•03 and p = 0•52, respectively). On external testing, model achieved 73•4% accuracy vs. 69•3%, 73•4%, 73•1%, 67•9%, and 63•4% for the two subspecialists and three junior radiologists (p = 0•14, p = 0•89, p = 0•93, p = 0•02, p < 0•01 for radiologists 1–5, respectively).InterpretationDeep learning can classify primary bone tumors using conventional radiographs in a multi-institutional dataset with similar accuracy compared to subspecialists, and better performance than junior radiologists.FundingThe project described was supported by RSNA Research & Education Foundation, through grant number RSCH2004 to Harrison X. Bai.
Read full abstract