Purpose. To determine glioma grading by applying radiomic analysis or deep convolutional neural networks (DCNN) and to benchmark both approaches on broader validation sets. Methods. Seven public datasets were considered: (1) low-grade glioma or high-grade glioma (369 patients, BraTS’20); (2) well-differentiated liposarcoma or lipoma (115, LIPO); (3) desmoid-type fibromatosis or extremity soft-tissue sarcomas (203, Desmoid); (4) primary solid liver tumors, either malignant or benign (186, LIVER); (5) gastrointestinal stromal tumors (GISTs) or intra-abdominal gastrointestinal tumors radiologically resembling GISTs (246, GIST); (6) colorectal liver metastases (77, CRLM); and (7) lung metastases of metastatic melanoma (103, Melanoma). Radiomic analysis used 464 radiomic features for the BraTS’20 dataset and 2016 for the other datasets. Random forests (RF), Extreme Gradient Boosting (XGBOOST), and a voting algorithm combining both classifiers were tested. The classifier parameters were optimized using repeated nested stratified cross-validation. The feature importance of each classifier was computed using the Gini index or permutation feature importance. DCNN classification was performed on 2D axial and sagittal slices encompassing the tumor. Where necessary, a balanced database was created using smart slice selection. ResNet50, Xception, EfficientNetB0, and EfficientNetB3, pretrained on ImageNet, were transferred to tumor classification and fine-tuned. Five-fold stratified cross-validation was performed to evaluate the models. Classification performance was measured using multiple indices, including the area under the receiver operating characteristic curve (AUC). Results. The best radiomic approach was based on XGBOOST for all datasets; AUC was 0.934 (BraTS’20), 0.86 (LIPO), 0.73 (LIVER), 0.844 (Desmoid), 0.76 (GIST), 0.664 (CRLM), and 0.577 (Melanoma). The best DCNN was based on EfficientNetB0; AUC was 0.99 (BraTS’20), 0.982 (LIPO), 0.977 (LIVER), 0.961 (Desmoid), 0.926 (GIST), 0.901 (CRLM), and 0.89 (Melanoma). Conclusion. Tumors can be accurately classified by adapting state-of-the-art machine learning algorithms to the medical context.
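As an illustration of the evaluation protocol named in the Methods (stratified five-fold cross-validation scored by AUC), the following is a minimal dependency-free sketch. It is not the authors' pipeline: the function names `stratified_folds` and `auc`, the round-robin fold assignment, and the toy labels and scores are assumptions made for the example only.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, keeping class ratios similar.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented): 10 negative and 5 positive cases split into 5 folds.
toy_labels = [0] * 10 + [1] * 5
for fold in stratified_folds(toy_labels, k=5):
    print(sorted(fold))

# Three of the four positive/negative pairs are ranked correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

In practice each fold would serve once as the held-out set while a classifier is trained on the remaining four, and the per-fold AUC values would then be aggregated. A library implementation of both steps would normally be preferred over this hand-written version.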