Measuring the volumes of intracerebral hemorrhage (ICH) and intraventricular hemorrhage (IVH) provides critical information for the precise treatment of patients with spontaneous ICH, but it remains challenging, especially for IVH segmentation. Previously proposed ICH and IVH segmentation tools lack external validation and segmentation quality assessment. This study aimed to develop a robust deep learning model for the segmentation of ICH and IVH with external validation, and to provide quality assessment for IVH segmentation. A Residual Encoding Unet (REUnet) for the segmentation of ICH and IVH was developed using a dataset of 977 CT images (all contained ICH, and 338 contained IVH), with five-fold cross-validation used for training and internal validation, and was externally tested on an independent dataset of 375 CT images (all contained ICH, and 105 contained IVH). The performance of REUnet was compared with that of six other advanced deep learning models. Subsequently, three approaches, Prototype Segmentation (ProtoSeg), Test Time Dropout (TTD), and Test Time Augmentation (TTA), were employed to derive segmentation quality scores in the absence of ground truth, providing a way to assess segmentation quality in real practice. For ICH segmentation, the median (lower quantile-upper quantile) Dice scores obtained from REUnet were 0.932 (0.898-0.953) for internal validation and 0.888 (0.859-0.916) for the external test, both of which were better than those of the other models, except that performance was comparable to that of nnUnet3D in the external test. For IVH segmentation, the Dice scores obtained from REUnet were 0.826 (0.757-0.868) for internal validation and 0.777 (0.693-0.827) for the external test, which were better than those of all other models.
The concordance correlation coefficients between the volumes estimated from the REUnet-generated segmentations and those from the manual segmentations ranged from 0.944 to 0.987 for both ICH and IVH. For IVH segmentation quality assessment, the segmentation quality score derived from ProtoSeg was correlated with the Dice score (Spearman r=0.752 for the external test) and outperformed those from TTD (Spearman r=0.718) and TTA (Spearman r=0.260) in the external test. By applying a threshold to the ProtoSeg-derived segmentation quality score, we were able to identify low-quality IVH segmentation results. The proposed REUnet offers a promising tool for accurate and automated segmentation of ICH and IVH and for effective IVH segmentation quality assessment, and thus has the potential to facilitate therapeutic decision-making for patients with spontaneous ICH in clinical practice.
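As an illustration of the evaluation metrics named in this abstract, the following minimal sketch computes a Dice score between two binary masks, Lin's concordance correlation coefficient between two sets of volume estimates, and a simple threshold rule for flagging low-quality segmentations. It is not the authors' implementation: the function names, the convention for two empty masks, and the threshold value in the usage note are all assumptions made for the example.

```python
import numpy as np


def dice_score(pred, truth):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient (population moments)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    cov_xy = ((x - mean_x) * (y - mean_y)).mean()
    return 2.0 * cov_xy / (x.var() + y.var() + (mean_x - mean_y) ** 2)


def flag_low_quality(quality_scores, threshold):
    """Indices of cases whose quality score falls below the threshold."""
    scores = np.asarray(quality_scores, dtype=float)
    return np.flatnonzero(scores < threshold).tolist()
```

For example, `flag_low_quality([0.9, 0.4, 0.7], threshold=0.5)` flags only the second case. The threshold of 0.5 is a placeholder; the abstract does not state the value used in the study.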