Abstract
Although genome-wide association studies of prostate cancer have revealed numerous genetic loci associated with disease risk, efforts to characterize the causal mechanisms of germline genetic variants have been unsystematic in their approach. While select studies have evaluated the effects of the most significantly associated variants on the expression of highly proximal genes, both the complex nature of gene expression regulation and the majority of prostate cancer risk left unaccounted for by genome-wide association study loci suggest that certain risk genes and variants remain undiscovered. In pursuit of such causal genetic factors, we analyzed germline genotype data paired with gene expression data from normal prostate tissue in 471 subjects to build regularized statistical models of how gene expression of the prostate is influenced by cis-genetic variation. By applying these models to genome-wide association study data, predictions of transcript expression levels were generated transcriptome-wide for over 230,000 male subjects (14,616 prostate cancer cases, 219,339 controls) from the UK Biobank, GERA Cohort, ProHealth Study, and California Men's Health Study. Finally, these transcript abundances were assessed in relation to prostate cancer case-control status to identify those genes with expression levels most predictive of prostate cancer diagnosis. Among the 12,014 genes for which models were successfully fit, 29 were transcriptome-wide significant via Bonferroni correction, while another 9 exhibited p-values that were suggestive. Moreover, 19 of the 38 significant or suggestive genes replicated in the validation cohort of Kaiser Permanente health plan members (GERA, ProHealth, CMHS). At previously implicated risk loci for prostate cancer, numerous genes were significantly associated with disease risk, including MSMB, NCOA4, and AGAP7 at 10q11.22, HNF1B at 17q12, as well as POU5F1B and PCAT1 at 8q24.21.
Furthermore, several genes whose expression levels were not previously implicated were associated with prostate cancer risk. The most noteworthy of these was TMPRSS2, part of the TMPRSS2-ERG gene fusion, which represents the most frequent somatic alteration discovered in prostate cancer tumors. When conditioned on a previously reported prostate cancer GWAS cis-variant at 21q22.3, TMPRSS2 retained transcriptome-wide significance. Further analysis of the genetic effects on TMPRSS2-ERG expression in prostate cancer tumors from The Cancer Genome Atlas (TCGA) suggested a novel mechanism by which germline and somatic factors cooperate to increase disease risk. By systematically characterizing gene expression of the prostate in this transcriptome-wide association study (TWAS) of prostate cancer, we identified prostate cancer risk genes and propose a novel germline-somatic interaction mechanism of cancer risk.
Citation Format: Nima C. Emami, Joshua Hoffman, Elad Ziv, John S. Witte. Imputation of the prostate cancer transcriptome in over 230,000 men reveals novel germline-somatic interaction mechanism of cancer risk [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 2968.
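The abstract reports transcriptome-wide significance "via Bonferroni correction" across the 12,014 genes with fitted models, but does not state the family-wise error level used. As a minimal sketch, assuming a conventional level of 0.05 (an assumption, not a value given in the abstract), the per-gene p-value cutoff would be about 4.2e-6. The function names below are hypothetical and serve only to illustrate the arithmetic.

```python
# Illustrative sketch only. ALPHA = 0.05 is an ASSUMED family-wise error
# rate; the abstract does not report the level that was actually used.
ALPHA = 0.05
N_GENES_TESTED = 12_014  # genes with successfully fitted models (from the abstract)


def bonferroni_threshold(n_tests: int, alpha: float = ALPHA) -> float:
    """Per-test p-value cutoff that controls the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def is_transcriptome_wide_significant(
    p_value: float, n_tests: int = N_GENES_TESTED, alpha: float = ALPHA
) -> bool:
    """True if a gene-level p-value falls below the Bonferroni cutoff."""
    return p_value < bonferroni_threshold(n_tests, alpha)


threshold = bonferroni_threshold(N_GENES_TESTED)
print(f"Bonferroni cutoff for {N_GENES_TESTED} genes: {threshold:.3e}")
```

Under this assumed level, a gene would need a p-value below the printed cutoff to count as transcriptome-wide significant. The abstract gives no criterion for its "suggestive" tier, so that tier is not modeled here.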