Abstract
Background: Tumorigenesis is often driven by multiple types of DNA aberrations, leading to diseases of enormous complexity and heterogeneity. Through the efforts of many consortia, including The Cancer Genome Atlas, large-scale multi-platform genomic data are now available, providing an opportunity to characterize cancer phenotypic heterogeneity more accurately at the molecular level using multiple technologies. In particular, many gene expression signatures have been developed to define specific cancer phenotypes, ranging from proliferation rates to disease subtypes to features of the tumor microenvironment. These mRNA expression features, along with protein expression, somatic mutations, and clinical features, provide a comprehensive molecular portrait of tumors. Integrating multi-platform genomic data to elucidate the relationship between genotype and phenotype is critical to understanding the “driver” features of tumor behavior.
Methods: In this study, we used tumor DNA information and an extensive archive of gene expression signatures as a framework to characterize multiple aspects of tumor biology. We present an integrative computational approach that uses a genome-wide association analysis and an Elastic Net prediction method to analyze the relationship between DNA copy number alterations (CNAs) and gene expression signatures. A multivariable Elastic Net modelling strategy was used to build objective predictive models for multiple molecular features, including protein and gene expression patterns, somatic mutations, and clinical phenotypes.
Results: Across breast cancers, we identified known and novel associations between DNA CNAs and many gene expression signatures.
Moreover, we were able to accurately predict the levels of many gene expression signatures within individual tumors using Elastic Net models based on DNA copy number features alone; these successful models could predict proliferation status and estrogen-signaling pathway activity. We were also able to build predictive models for many other key phenotypes, including intrinsic molecular subtypes, some protein expression features including clinical estrogen receptor status, and the somatic mutation status of TP53 and CDH1. This approach was also successfully applied to 24 other tumor types (Pan-Cancer), identifying a number of signatures that were repeatedly predictable across multiple tumor types, including immune cell features in squamous/basal-like cancers. These Elastic Net DNA predictors could also be called from commonly used DNA-based gene panels containing only hundreds of genes, thus facilitating their use to obtain “non-genetic” tumor information that could assist in guiding therapeutic decision making.
Conclusions: Our results demonstrate an ability to build DNA CNA-based predictors for multiple complex cancer phenotypes for breast tumors. Once appropriately validated, a whole new set of prognostic and predictive biomarkers could be read out from existing DNA-based gene panels, thus providing more guidance for precision medicine at no additional cost.
Citation Format: Youli Xia, Cheng Fan, Katherine A Hoadley, Joel S Parker, Charles M Perou. Genetic determinants of the molecular portraits of epithelial cancers [abstract]. In: Proceedings of the 2019 San Antonio Breast Cancer Symposium; 2019 Dec 10-14; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2020;80(4 Suppl):Abstract nr P4-05-12.