Abstract

Accurate segmentation of the breast is required for breast density estimation and the assessment of background parenchymal enhancement, both of which have been shown to be related to breast cancer risk. The MRI breast segmentation task is challenging, and recent work has demonstrated that convolutional neural networks perform well for this task. In this study, we investigated the performance of several two-dimensional (2D) U-Net and three-dimensional (3D) U-Net configurations using both fat-suppressed and nonfat-suppressed images. We also assessed the effect of changing the number and quality of the ground truth segmentations. We designed eight studies to investigate the effect of input type and of the dimensionality of the U-Net operations on breast MRI segmentation. Our training data contained 70 whole breast volumes of T1-weighted sequences without fat suppression (WOFS) and with fat suppression (FS). For each subject, we registered the WOFS and FS volumes together before manually segmenting the breast to generate ground truth. We compared four different input types to the U-Nets: WOFS, FS, MIXED (WOFS and FS images treated as separate samples), and MULTI (WOFS and FS images combined into a single multichannel image). We trained 2D U-Nets and 3D U-Nets with these data, which resulted in our eight studies (2D-WOFS, 3D-WOFS, 2D-FS, 3D-FS, 2D-MIXED, 3D-MIXED, 2D-MULTI, and 3D-MULTI). For each of these studies, we performed a systematic grid search to tune the hyperparameters of the U-Nets. A separate validation set with 15 whole breast volumes was used for hyperparameter tuning. We performed a Kruskal-Wallis test on the results of our hyperparameter tuning and did not find a statistically significant difference among the ten top models of each study. For this reason, we chose as the best model the one with the highest mean Dice similarity coefficient (DSC) on the validation set.
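The four input types differ only in how the registered WOFS and FS volumes are arranged before training. The following is a minimal sketch of that arrangement, not the authors' code. It assumes NumPy arrays of shape (subjects, depth, height, width); the function name `build_inputs` and the reuse of one ground truth mask for both sequences in the MIXED case are illustrative assumptions, the latter motivated by the volumes being registered before segmentation.

```python
import numpy as np


def build_inputs(wofs, fs, masks, mode):
    """Arrange registered WOFS/FS volumes for one of the four input types.

    wofs, fs, masks: arrays of shape (N, D, H, W), one entry per subject.
    Returns (images, labels); images has shape (samples, channels, D, H, W).
    Hypothetical helper for illustration only.
    """
    if wofs.shape != fs.shape or wofs.shape != masks.shape:
        raise ValueError("wofs, fs and masks must share the same shape")

    if mode == "WOFS":
        # Single-channel input, nonfat-suppressed volumes only.
        return wofs[:, np.newaxis], masks
    if mode == "FS":
        # Single-channel input, fat-suppressed volumes only.
        return fs[:, np.newaxis], masks
    if mode == "MIXED":
        # WOFS and FS volumes become separate single-channel samples.
        # Assumption: each subject's mask is reused for both sequences.
        images = np.concatenate([wofs, fs], axis=0)[:, np.newaxis]
        labels = np.concatenate([masks, masks], axis=0)
        return images, labels
    if mode == "MULTI":
        # WOFS and FS volumes become two channels of one sample.
        return np.stack([wofs, fs], axis=1), masks
    raise ValueError(f"unknown input type: {mode!r}")
```

Under these assumptions, MIXED doubles the number of samples while keeping one channel, whereas MULTI keeps the number of samples and doubles the channels. A 2D variant would additionally slice each volume along the depth axis.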
The reported test results are those of the top model of each study on our test set, which contained 19 whole breast volumes annotated by three readers, with the annotations fused using the STAPLE algorithm. We also investigated the effect of the quality of the training annotations and the number of training samples for this task. The study with the highest average DSC was 3D-MULTI with 0.96±0.02. The second highest average was 2D-WOFS (0.96±0.03), and the third was 2D-MULTI (0.96±0.03). We performed the Kruskal-Wallis one-way ANOVA test with Dunn's multiple comparison tests using Bonferroni P-value correction on the results of the selected model of each study and found that 3D-MULTI, 2D-MULTI, 3D-WOFS, 2D-WOFS, 2D-FS, and 3D-FS were not statistically different in their distributions, which indicates that comparable results can be obtained in fat-suppressed and nonfat-suppressed volumes and that there is no significant difference between the 3D and 2D approaches. Our results also suggested that networks trained on single-sequence images or on multiple-sequence images organized as multichannel images perform better than models trained on a mixture of volumes from different sequences. Our investigation of the size of the training set revealed that training a U-Net in this domain requires only a modest amount of training data: results obtained with 49 and 70 training datasets were not significantly different. To summarize, we investigated the use of 2D U-Nets and 3D U-Nets for breast volume segmentation in T1-weighted fat-suppressed and nonfat-suppressed volumes. Although our highest score was obtained in the 3D-MULTI study, in which we took advantage of the information in both fat-suppressed and nonfat-suppressed volumes and their 3D structure, all of the methods we explored gave accurate segmentations with an average DSC above 0.94, demonstrating that the U-Net is a robust segmentation method for breast MRI volumes.
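The DSC used throughout is twice the overlap between the predicted and ground truth masks divided by the sum of their sizes. The sketch below shows that computation together with the selection rule described above (highest mean DSC on the validation set). It is an illustration under stated assumptions, not the authors' implementation: the function names `dice` and `select_best_model`, the dictionary layout of the scores, and the convention of returning 1.0 for two empty masks are all assumptions.

```python
import numpy as np


def dice(pred, truth):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("masks must have the same shape")
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def select_best_model(val_scores):
    """Return the model name with the highest mean validation DSC.

    val_scores: dict mapping a model name to a list of per-volume DSC values.
    """
    return max(val_scores, key=lambda name: np.mean(val_scores[name]))
```

For example, a prediction covering two voxels of which one overlaps a one-voxel ground truth gives 2·1/(2+1), about 0.67. The statistical comparison between studies (Kruskal-Wallis with Dunn's post hoc tests) operates on the per-volume DSC values that such a function produces.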
