In the era of precision medicine, accurate phenotype prediction for heterogeneous diseases such as cancer is becoming feasible thanks to advanced technologies that link genotypes to phenotypes. However, integrating different types of biological data remains difficult because the data are so heterogeneous. In this study, we focused on predicting the traits of acute myeloid leukemia (AML), a blood cancer, by combining different kinds of biological data. We used a recently developed method, the Omics Generative Adversarial Network (omicsGAN), to improve the classification of cancer outcomes. The primary advantages of a GAN include its ability to create synthetic data that is nearly indistinguishable from real data, its high flexibility, and its wide range of applications, including multi-omics data analysis. In our study, the GAN was also effective at combining two types of biological data: we created synthetic datasets for gene activity and DNA methylation. Our method predicted disease traits more accurately than models using the original data alone. The experimental results provide evidence that generating synthetic data from interacting multi-omics datasets with GANs improves overall prediction quality. Furthermore, we identified the top-ranked significant genes through statistical methods and pinpointed potential candidate drug agents through in-silico studies. The proposed drugs, which are also supported by other independent studies, might play a crucial role in the treatment of AML. The code is available on GitHub: https://github.com/SabrinAfroz/omicsGAN_codes?fbclid=IwAR1-/stuffmlE0hyWgSu2wlXo6dYlKUei3faLdlvpxTOOUPVlmYCloXf4Uk9ejK4I