Abstract

Deep learning in medical applications is limited by the low availability of large labeled, annotated, or segmented training datasets. With insufficient training data, networks cannot learn the fine nuances of the space of possible images in a given medical domain. Important diagnostic features may then be suppressed, making these deep learning systems suboptimal in performance and vulnerable to adversarial attacks. We formulate a framework to address this lack of labeled data. We test this formulation in the computed tomography (CT) image domain and present an approach that can synthesize large sets of novel CT images at high resolution across the full Hounsfield unit (HU) range. Our method requires only a small annotated dataset of lung CT scans from 30 patients (available online at the TCIA) and a large non-annotated dataset of high-resolution CT images from 14k patients (received from the NIH, not publicly available). It converts the small annotated dataset into a large annotated dataset through a sequence of steps: texture learning via StyleGAN, label learning via U-Net, and semi-supervised learning via CycleGAN/Pixel-to-Pixel (P2P) architectures. The resulting large annotated dataset can then be used to train deep learning networks for medical applications. The framework can also synthesize CT images with varied anatomies that did not exist in either input dataset, enriching the dataset even further. We demonstrate our framework by synthesizing lung CT scans together with their newly generated annotations, and we compare it with other state-of-the-art generative models that produce only images without annotations. We evaluate the framework's effectiveness via a visual Turing test conducted with the help of a few doctors and radiologists. The result is the capability to generate an unlimited number of annotated CT images. Our approach works for all HU windows with minimal degradation in anatomical plausibility and hence could be used as a general-purpose framework for annotated data augmentation for deep learning applications in medical imaging.
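
To make the notion of an "HU window" concrete, the following is a minimal sketch of how a window (center and width) maps raw Hounsfield values to a normalized display range. It is not taken from the paper: the function name `apply_hu_window` and the lung window setting of center -600 HU and width 1500 HU are illustrative assumptions based on common radiology conventions.

```python
def apply_hu_window(hu_values, center, width):
    """Map Hounsfield units to [0, 1] for a display window.

    Hypothetical helper for illustration only; values below the window
    are clipped to 0.0 and values above it are clipped to 1.0.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    low = center - width / 2.0
    high = center + width / 2.0
    return [min(max((v - low) / (high - low), 0.0), 1.0) for v in hu_values]


# Assumed lung window (center, width); not a setting stated in the abstract.
LUNG_WINDOW = (-600, 1500)

# Example HU samples, roughly air, lung tissue, water, and dense bone.
pixels = [-1000, -600, 0, 400]
print(apply_hu_window(pixels, *LUNG_WINDOW))
```

Images synthesized across the full HU range can be viewed through any such window after generation, which is presumably why plausibility across all windows matters for a general-purpose augmentation framework.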
