This study addresses the heterogeneity of Breast Cancer (BC) by employing a Conditional Probabilistic Diffusion Model (CPDM) to synthesize Magnetic Resonance Images (MRIs) from multi-omic data, including gene expression, copy number variation, and DNA methylation. Previous studies were limited by the scarcity of paired medical imaging and genomic data, a gap the CPDM aims to close. Using their multi-omic profiles, the trained CPDM generated synthetic MRIs for 726 TCGA-BRCA patients who lacked actual MRIs. Image quality was evaluated with the Fréchet Inception Distance (FID), Mean Squared Error (MSE), and Structural Similarity Index Measure (SSIM); under 15-fold cross-validation, the CPDM achieved an FID of 2.02, an MSE of 0.02, and an SSIM of 0.59. The synthetic MRIs were then used to predict clinical attributes, achieving an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.82 and an Area Under the Precision-Recall Curve (AUPRC) of 0.84 for predicting ER+/HER2+ subtypes. The synthetic MRIs also predicted BC patient survival with a concordance index (C-index) of 0.88, outperforming baseline models. These results demonstrate the potential of CPDMs to generate MRIs from BC patients' genomic profiles, offering valuable insights for radiogenomic research and advances in precision medicine. The study provides a novel approach to understanding BC heterogeneity in support of early detection and personalized treatment.