Abstract
Automatic segmentation of medical lesions is a prerequisite for efficient clinical analysis. Segmentation algorithms for multimodal medical images have received much attention in recent years, and different strategies for multimodal combination (or fusion), such as probability theory, fuzzy models, belief functions, and deep neural networks, have been developed. In this paper, we propose the modality-weighted UNet (MW-UNet), an attention-based fusion method that combines multimodal images for medical lesion segmentation. MW-UNet is a multimodal fusion network based on UNet, but with shallower layers and fewer feature-map channels to reduce the number of network parameters, and it introduces a new multimodal fusion mechanism called fusion attention, which combines feature maps in intermediate layers through a weighted sum rule. During training, all the fusion weights are updated through backpropagation like the other parameters in the network. We also incorporate residual blocks into MW-UNet to further improve segmentation performance. The agreement between the automatic multimodal lesion segmentations and the manual contours was quantified by (1) five metrics: Dice, 95% Hausdorff distance (HD95), volumetric overlap error (VOE), relative volume difference (RVD), and mean Intersection-over-Union (mIoU); and (2) the number of parameters and FLOPs, which measure the complexity of the network. The proposed method is verified on ZJCHD, a contrast-enhanced computed tomography (CECT) data set for liver lesion segmentation collected at the Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Hangzhou, China. For accuracy evaluation, we use 120 patients with liver lesions from ZJCHD, of which 100 are used for fourfold cross-validation (CV) and 20 for a hold-out (HO) test. The mean Dice was and for the HO and CV tests, respectively. The corresponding HD95, VOE, RVD, and mIoU of the two tests are 1.95 ± 1.83 and 2.67 ± 3.35 mm, 13.11 ± 15.83 and , 12.20 ± 18.20 and , and 83.79 ± 15.83 and . The parameters and FLOPs of our method are 4.04 M and 18.36 G, respectively. The results show that our method performs well on multimodal liver lesion segmentation and can be easily extended to other multimodal data sets and other networks for multimodal fusion. Our method has the potential to provide doctors with multimodal annotations and assist them in clinical diagnosis.
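To illustrate the weighted-sum fusion described above, the following is a minimal sketch, not the authors' released implementation: per-modality feature maps from intermediate layers are combined with scalar weights that are learned by backpropagation alongside the other network parameters. The module name WeightedSumFusion, the softmax normalization of the weights, and the two-phase CECT example are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn as nn

class WeightedSumFusion(nn.Module):
    """Fuse per-modality feature maps with learnable scalar weights."""
    def __init__(self, num_modalities: int):
        super().__init__()
        # One learnable weight per modality, updated by backprop like any
        # other parameter; softmax keeps the weights positive and normalized.
        self.weights = nn.Parameter(torch.ones(num_modalities))

    def forward(self, feats):
        # feats: list of tensors, each of shape (B, C, H, W), one per modality
        w = torch.softmax(self.weights, dim=0)
        return sum(w[i] * f for i, f in enumerate(feats))

# Usage: fuse feature maps from two CECT phases inside an intermediate layer.
fusion = WeightedSumFusion(num_modalities=2)
arterial = torch.randn(1, 64, 128, 128)
venous = torch.randn(1, 64, 128, 128)
fused = fusion([arterial, venous])  # shape (1, 64, 128, 128)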