Abstract

Journal of Urology, 1 April 2023

MP40-10: CREATING HIGH QUALITY SYNTHETIC GENITOURINARY TISSUE IMAGES FROM HISTOLOGY REPOSITORIES TO OVERCOME LIMITATIONS WITH THE USE OF CLINICAL TRIAL DATA IN AI MODELS

Derek Van Booven, Yasamin Mirzabeigi, and Himanshu Arora

https://doi.org/10.1097/JU.0000000000003278.10

INTRODUCTION AND OBJECTIVE: Timely diagnosis and assessment of prognosis remain challenges in prostate cancer (PCa), contributing to many deaths and increasing overall disease risk and treatment cost. Although clinical testing strategies are effective, recent advances in machine learning suggest promising approaches for generating synthetic data, which could avoid the need to rely on clinical trial data to develop training models. Our study is the first to use generative adversarial network (GAN) technology to generate high-quality synthetic digital histology data from 9 genitourinary organs.

METHODS: We downloaded digital pathology images for 9 genitourinary tissues from the GTEx Portal, a repository of 25,713 images spanning 39 unique tissue classes. Downloaded images were color-normalized within each tissue to ensure that all images share the same range of expected colors and that none are too light or too dark, which could cause issues in downstream analysis. The images were then run through HistoQC to determine baseline quality and through PyHist for single-patch creation. This yielded a data repository averaging over 10,000 patches per tissue.
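The patch-creation step tiles each whole-slide image into fixed-size blocks. A minimal NumPy sketch of non-overlapping tiling is shown below; this is an illustration of the general technique, not PyHist's actual interface, and the `tile_image` helper and toy array are hypothetical:

```python
import numpy as np

def tile_image(img, patch=96):
    """Split an RGB image array (H, W, 3) into non-overlapping
    patch x patch tiles, dropping partial tiles at the edges.
    Illustrative stand-in for PyHist-style patch creation."""
    h, w = img.shape[:2]
    tiles = [
        img[y:y + patch, x:x + patch]
        for y in range(0, h - patch + 1, patch)
        for x in range(0, w - patch + 1, patch)
    ]
    return np.stack(tiles)

# Toy "slide": a 200x300 RGB array stands in for a scanned section.
slide = np.zeros((200, 300, 3), dtype=np.uint8)
print(tile_image(slide, 96).shape)  # (6, 96, 96, 3)
```

Dropping partial edge tiles keeps every training patch the same shape, which is what a convolutional GAN expects as input.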
These patches were then run through our custom deep convolutional GAN, implemented in PyTorch and run on a local GPU cluster, to build models from which we can generate new synthetic images.

RESULTS: 2,700 images were obtained from GTEx. Segmentation of these images into 96×96 blocks produced 32,411 patches, and into 256×256 blocks produced 163,916 patches. These patches served as the training database for a standard deep convolutional GAN (dcGAN) that created synthetic images per tissue. Upon creation, a color-gradient PCA of the images showed a dropout rate of 21.2%. Images were subjected to further quality control by random manual inspection, and files beyond a set size threshold were discarded. After image inspection, pathologists were asked to assess image quality, and 80% of the images were deemed of sufficient quality. After the entire QC process, the Fréchet inception distance (FID) was calculated, and a simple classification module was created to assess image similarity within a tissue and image distinctness between tissues. The lowest FID, 54.2, was achieved at 5,000 synthetic images. The classification module separated the images with an accuracy of 74%, with similar tissues being classified together as the most common mistake.

CONCLUSIONS: We have created a deep convolutional GAN that can generate new synthetic images with a reliability rate of 80%. This needs to be enhanced, and quality needs to be improved, before a final repository can be created.
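The FID values above compare Gaussian fits to Inception-network features of real versus synthetic patches. The standard Fréchet-distance formula behind that metric can be sketched as follows, using toy low-dimensional features in place of real 2048-D Inception activations; the `frechet_distance` helper and all data here are illustrative, not the authors' evaluation code:

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between two Gaussians (the formula
    underlying FID): ||mu1-mu2||^2 + Tr(S1 + S2 - 2*sqrt(S1*S2))."""
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard tiny imaginary parts
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)

# Toy 4-D features standing in for Inception activations of
# "real" and "synthetic" patches; the mean shift makes fid > 0.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
fake = rng.normal(loc=0.5, size=(500, 4))
mu_r, s_r = real.mean(0), np.cov(real, rowvar=False)
mu_f, s_f = fake.mean(0), np.cov(fake, rowvar=False)
fid = frechet_distance(mu_r, s_r, mu_f, s_f)
print(float(fid))
```

A lower value means the synthetic feature distribution sits closer to the real one, which is why the abstract reports the lowest FID (54.2) as its best operating point.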
Source of Funding: AUA Research Scholar Award to HA

© 2023 by American Urological Association Education and Research, Inc. Volume 209, Issue Supplement 4, April 2023, Page e548.
