State-of-the-art automated segmentation methods achieve exceptionally high performance on the Brain Tumor Segmentation (BraTS) challenge, a dataset of uniformly processed and standardized magnetic resonance generated images (MRIs) of gliomas. However, a reasonable concern is that these models may not fare well on clinical MRIs that do not belong to the specially curated BraTS dataset. Research using the previous generation of deep learning models indicates significant performance loss on cross-institutional predictions. Here, we evaluate the cross-institutional applicability and generalzsability of state-of-the-art deep learning models on new clinicaldata. We train a state-of-the-art 3D U-Net model on the conventional BraTS dataset comprising low- and high-grade gliomas. We then evaluate the performance of this model for automatic tumor segmentation of brain tumors on in-house clinical data. This dataset contains MRIs of different tumor types, resolutions, and standardization than those found in the BraTS dataset. Ground truth segmentations to validate the automated segmentation for in-house clinical data were obtained from expert radiationoncologists. We report average Dice scores of 0.764, 0.648, and 0.61 for the whole tumor, tumor core, and enhancing tumor, respectively, in the clinical MRIs. These means are higher than numbers reported previously on same institution and cross-institution datasets of different origin using different methods. There is no statistically significant difference when comparing the dice scores to the inter-annotation variability between two expert clinical radiation oncologists. Although performance on the clinical data is lower than on the BraTS data, these numbers indicate that models trained on the BraTS dataset have impressive segmentation performance on previously unseen images obtained at a separate clinical institution. These images differ in the imaging resolutions, standardization pipelines, and tumor types from the BraTSdata. State-of-the-art deep learning models demonstrate promising performance on cross-institutional predictions. They considerably improve on previous models and can transfer knowledge to new types of brain tumors without additionalmodeling.
Read full abstract