Objective: Deep-learning algorithms have been widely applied to automatic kidney ultrasound (US) image segmentation. However, obtaining a large number of accurately labeled kidney images in clinical practice is difficult and time-consuming. To address this problem, we propose an efficient cross-modal transfer learning method that improves the performance of a segmentation network trained on a limited labeled kidney US dataset.

Methods: We implement an improved image-to-image translation network, Seg-CycleGAN, to generate accurately annotated kidney US data from labeled abdominal computed tomography (CT) images. The Seg-CycleGAN framework consists of two components: (i) a standard CycleGAN network that visually simulates kidney US images from a publicly available labeled abdominal CT dataset; and (ii) a segmentation network that preserves accurate kidney anatomical structures in the simulated US images. We then employ a fine-tuning strategy, pretraining on the large number of simulated kidney US images and fine-tuning on the small number of real annotated kidney US images, to obtain better segmentation results.

Results: To validate the proposed method, we tested it on both normal and abnormal kidney US images. The method achieved a Dice similarity coefficient of 0.8548 on the full testing dataset and 0.7622 on the abnormal testing dataset.

Conclusions: Compared with existing data augmentation and transfer learning methods, the proposed method improves the accuracy and generalization of kidney US image segmentation networks trained with limited data. It therefore has the potential to significantly reduce annotation costs in clinical settings.
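The segmentation accuracy above is reported as the Dice similarity coefficient, and the framework combines adversarial translation with a segmentation term. A minimal sketch of the metric and of that kind of combined objective follows; the loss weights `lam_cyc` and `lam_seg` are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-8):
    """Dice similarity coefficient between two binary masks (1.0 = perfect overlap)."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    intersection = np.sum(pred * target)
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def cycle_loss(real, reconstructed):
    """L1 cycle-consistency loss, as in standard CycleGAN training."""
    return float(np.mean(np.abs(np.asarray(real) - np.asarray(reconstructed))))

def seg_cyclegan_loss(adv_loss, cyc_loss, seg_loss, lam_cyc=10.0, lam_seg=1.0):
    """Combined objective: adversarial + cycle-consistency + segmentation terms.
    The weights here are illustrative defaults, not the paper's settings."""
    return adv_loss + lam_cyc * cyc_loss + lam_seg * seg_loss

# Identical masks overlap perfectly, so the Dice coefficient is 1.0.
mask = np.array([[1, 1], [0, 0]])
print(dice_coefficient(mask, mask))  # 1.0
```

In this sketch the segmentation term plays the role the abstract assigns to the segmentation network: it penalizes translated images whose predicted kidney mask drifts from the CT label, keeping the simulated US anatomically consistent.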