AI-based classification of three common malignant tumors in neuro-oncology: A multi-institutional comparison of machine learning and deep learning methods

Girish Bathla, Durjoy Deb Dhruba, Neetu Soni, Yanan Liu, Nicholas B Larson, Blake A Kassmeyer, Suyash Mohan, Douglas Roberts-Wolfe, Saima Rathore, Nam H Le, Honghai Zhang, Milan Sonka, Sarv Priya

doi:10.1016/j.neurad.2023.08.007

Abstract

Purpose: To determine if machine learning (ML) or deep learning (DL) pipelines perform better in AI-based three-class classification of glioblastoma (GBM), intracranial metastatic disease (IMD) and primary CNS lymphoma (PCNSL).

Methodology: Retrospective analysis included 502 cases for training (208 GBM, 67 PCNSL and 227 IMD), with external validation on 86 cases (27:27:32). Multiparametric MRI images (T1W, T2W, FLAIR, DWI and T1-CE) were co-registered, resampled, denoised and intensity normalized, followed by semiautomatic 3D segmentation of the enhancing tumor (ET) and peritumoral region (PTR). Model performance was assessed using several ML pipelines and 3D convolutional neural networks (3D-CNN) using sequence-specific masks, as well as combinations of masks. All pipelines were trained and evaluated with 5-fold nested cross-validation on internal data, followed by external validation using multi-class AUC.

Results: Two ML models achieved similar performance on the test set, one using T2-ET and T2-PTR masks (AUC: 0.885, 95% CI: [0.816, 0.935]) and another using T1-CE-ET and FLAIR-PTR masks (AUC: 0.878, CI: [0.804, 0.930]). The best-performing DL model achieved an AUC of 0.854 (CI: [0.774, 0.914]) on external data using T1-CE-ET and T2-PTR masks, followed by a model derived from T1-CE-ET, ADC-ET and FLAIR-PTR masks (AUC: 0.851, CI: [0.772, 0.909]).

Conclusion: ML- and DL-derived pipelines achieved similar performance. A T1-CE mask was used in three of the top four overall models. Additionally, all four models used a mask derived from the PTR, on either T2WI or FLAIR.
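The abstract reports a multi-class AUC but does not state how the three classes were combined into a single number. As a minimal sketch, assuming a macro-averaged one-vs-rest formulation (an assumption, not confirmed by the source), the metric could be computed as below. The function names `binary_auc` and `macro_ovr_auc` and the toy predictions are hypothetical and are not data from the study.

```python
from typing import Dict, List, Sequence


def binary_auc(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(labels: Sequence[str], probs: Sequence[Dict[str, float]]) -> float:
    """Unweighted mean of one-vs-rest AUCs over the classes found in `labels`.

    Assumption: macro averaging; the source does not name the scheme used.
    """
    classes = sorted(set(labels))
    aucs: List[float] = []
    for c in classes:
        scores = [p[c] for p in probs]       # predicted probability of class c
        is_pos = [y == c for y in labels]    # class c versus the other two
        aucs.append(binary_auc(scores, is_pos))
    return sum(aucs) / len(aucs)


# Toy, made-up predictions for six hypothetical cases (not study data).
labels = ["GBM", "GBM", "IMD", "IMD", "PCNSL", "PCNSL"]
probs = [
    {"GBM": 0.7, "IMD": 0.2, "PCNSL": 0.1},
    {"GBM": 0.4, "IMD": 0.5, "PCNSL": 0.1},
    {"GBM": 0.2, "IMD": 0.6, "PCNSL": 0.2},
    {"GBM": 0.3, "IMD": 0.4, "PCNSL": 0.3},
    {"GBM": 0.1, "IMD": 0.2, "PCNSL": 0.7},
    {"GBM": 0.5, "IMD": 0.1, "PCNSL": 0.4},
]
toy_auc = macro_ovr_auc(labels, probs)  # (0.875 + 0.875 + 1.0) / 3
```

With imbalanced classes such as the training cohort here (67 PCNSL against 208 GBM and 227 IMD), macro averaging weights each class equally regardless of its size; a prevalence-weighted average would give a different value, so the choice of scheme matters when comparing against the reported figures.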

Full Text