Developing deep learning models that segment medical images across multiple modalities with less data and annotation is an attractive but challenging task. Previous work has addressed it with complex external frameworks that bridge the gap between modalities. Exploiting the generalization ability of networks across modalities could provide simpler and more accessible methods, but this still requires comprehensive testing. This study aimed to explore the feasibility and robustness of using computed tomography (CT) images to assist the segmentation of magnetic resonance (MR) images via generalization, in the segmentation of the renal parenchyma of patients with renal cell carcinoma (RCC). Nephrographic CT images and fat-suppressed T2-weighted (fs-T2W) MR images were retrospectively collected. The pure CT dataset included 116 CT images. Additionally, 240 MR images were randomly divided into subsets A and B. From subset A, three pure MR training datasets were constructed, containing 40, 80, and 120 images, respectively. Three datasets were constructed from subset B in the same way. Mixed-modality datasets were then created by combining each of these pure MR datasets with the 116 CT images. 3D-UNET models were trained on these 13 datasets (one pure CT, six pure MR, and six mixed-modality) to segment the renal parenchyma in two steps: first the kidneys, then the renal parenchyma. The models were evaluated on internal MR (n=120) and CT (n=65) validation datasets and on an external CT validation dataset (n=79), using the mean Dice similarity coefficient (DSC). To demonstrate the robustness of the generalization ability across different proportions of modalities, we compared models trained with mixed modality at three different proportions against models trained with pure MR, using repeated measures analysis of variance (RM-ANOVA). We also developed a renal parenchyma volume quantification tool based on the trained models.
To evaluate the tool, the mean differences and Pearson correlation coefficients between the model-segmented volumes and the ground-truth volumes were calculated. For the models trained on the 116 CT images alone, the mean DSCs on MR validation were 0.826, 0.842, and 0.953 for the kidney segmentation model on whole images, the renal parenchyma segmentation model on kidneys with RCC, and the renal parenchyma segmentation model on kidneys without RCC, respectively. For all models trained with mixed modality, the mean DSCs were above 0.9 in all CT and MR validations. In the comparison between models trained with mixed modality and those trained with pure MR, the mean DSCs of the former were significantly greater than or equal to those of the latter at all three proportions of modalities. The volume differences were all significantly lower than one-third of the volumetric quantification error of a previous method, and the Pearson correlation coefficients of the volumes were all above 0.96 for kidneys with and without RCC in all three validations. CT images can be used to assist the segmentation of MR images via generalization, with or without the supervision of MR data, and this ability showed acceptable robustness. A tool for accurately measuring renal parenchymal volume on CT and MR images was established.
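The evaluation metrics named above (DSC, mean volume difference, and Pearson correlation of volumes) can be illustrated with a minimal NumPy sketch. This is not the authors' tool: the function names `dice`, `mask_volume_ml`, and `volume_agreement`, the voxel-spacing argument, and the convention of scoring two empty masks as 1.0 are all assumptions made for illustration.

```python
import numpy as np


def dice(pred, truth):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def mask_volume_ml(mask, spacing_mm):
    """Volume of a binary mask in millilitres, given voxel spacing in mm."""
    voxel_mm3 = float(np.prod(spacing_mm))
    return float(np.asarray(mask, dtype=bool).sum()) * voxel_mm3 / 1000.0


def volume_agreement(pred_volumes, truth_volumes):
    """Mean difference and Pearson correlation between paired volumes."""
    pred = np.asarray(pred_volumes, dtype=float)
    truth = np.asarray(truth_volumes, dtype=float)
    mean_diff = float(np.mean(pred - truth))
    pearson_r = float(np.corrcoef(pred, truth)[0, 1])
    return mean_diff, pearson_r
```

In this sketch, `dice` would be applied per case to a predicted and a ground-truth parenchyma mask and then averaged over a validation set, while `volume_agreement` would take the per-kidney volumes produced by `mask_volume_ml` for the model and the reference segmentations.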