Automatic and accurate segmentation of the whole heart from 3D cardiac images plays an important role in helping physicians diagnose and treat cardiovascular disease. However, manual labeling of heart images is time-consuming and laborious, which limits the amount of annotated CT or MRI data available for training deep networks and in turn reduces the accuracy of whole heart segmentation. Meanwhile, owing to their different imaging mechanisms, multi-modality data provide complementary, multi-level information about cardiac structures, which can be exploited to improve segmentation accuracy. Therefore, this paper proposes a cascaded framework with cross-modality transfer learning for whole heart segmentation (CM-TranCaF), which consists of three key modules: a modality transfer network (MTN), a U-shaped multi-attention network (MAUNet), and a spatial configuration network (SCN). The MTN translates MR images from the MRI domain to the CT domain through adversarial training, thereby enlarging the training set. MAUNet is built on U-Net, with attention gates (AGs) integrated into the skip connections to reduce the weight of background pixels. Moreover, to alleviate boundary blurring, a position attention block (PAB) is embedded in the bottom layer to aggregate similar features. Finally, the SCN refines the segmentation results by exploiting the anatomical relationships among different cardiac substructures. Evaluated on the MM-WHS challenge dataset, CM-TranCaF achieves a Dice score of 91.1% on the test set. Extensive experimental results demonstrate the effectiveness of the proposed method compared with other state-of-the-art methods.
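
To make the attention-gated skip connections concrete, the following is a minimal sketch of a 3D attention gate in the spirit of the AGs described above, written in PyTorch with hypothetical channel sizes and tensor shapes; the abstract does not specify the paper's exact configuration, so this is an illustrative assumption rather than the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionGate3D(nn.Module):
    """Re-weights encoder (skip) features with a coefficient map derived from
    the decoder's gating signal, so background voxels are suppressed before
    concatenation. Channel sizes here are hypothetical."""

    def __init__(self, enc_ch: int, gate_ch: int, inter_ch: int):
        super().__init__()
        self.theta = nn.Conv3d(enc_ch, inter_ch, kernel_size=1)   # project skip features
        self.phi = nn.Conv3d(gate_ch, inter_ch, kernel_size=1)    # project gating signal
        self.psi = nn.Conv3d(inter_ch, 1, kernel_size=1)          # 1-channel attention map
        self.relu = nn.ReLU(inplace=True)
        self.sigmoid = nn.Sigmoid()

    def forward(self, skip: torch.Tensor, gate: torch.Tensor) -> torch.Tensor:
        # Upsample the coarser gating signal to the skip-feature resolution.
        gate_up = F.interpolate(
            self.phi(gate), size=skip.shape[2:], mode="trilinear", align_corners=False
        )
        # Additive attention: alpha in [0, 1] per voxel, low for background.
        alpha = self.sigmoid(self.psi(self.relu(self.theta(skip) + gate_up)))
        return skip * alpha  # re-weighted skip features passed to the decoder


# Usage sketch with made-up shapes: gate one skip connection of a U-shaped network.
skip_feat = torch.randn(1, 32, 64, 64, 64)   # encoder features (hypothetical)
gate_feat = torch.randn(1, 64, 32, 32, 32)   # decoder gating signal, one level coarser
ag = AttentionGate3D(enc_ch=32, gate_ch=64, inter_ch=16)
out = ag(skip_feat, gate_feat)               # same spatial shape as skip_feat
```

The gated output replaces the raw skip features in the decoder concatenation, which is how such a gate reduces the contribution of background pixels while leaving foreground responses largely intact.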