Abstract

RNA-seq is commonly used to profile tumors through transcript quantification and the detection of alternative transcripts and gene fusions. Recent advances have also enabled the assessment of allele-specific gene expression. In this study, we characterize chromosomal allelic imbalances (AI) by identifying haplotype-specific patterns in RNA-seq data. We processed SNP-array, whole-exome sequencing (WES), and RNA-seq data for two cancer types from The Cancer Genome Atlas (TCGA): colon adenocarcinoma (COAD) and lung adenocarcinoma (LUAD). For the WES and RNA-seq data, we phased variants in the 1000 Genomes Project database and then used the hapLOHseq algorithm to detect AI. AI events detected in the Affymetrix SNP Array 6.0 data were used as a gold standard, because the array queries a much larger proportion of heterozygous sites than WES or RNA-seq. In TCGA COAD, comparing AI calls derived from WES and RNA-seq with the gold-standard events inferred from the SNP array yielded similar specificity estimates of 92% and 96%, respectively. However, RNA-seq-derived AI calls overlapped less with gold-standard events (43% sensitivity) than WES-derived AI calls did (86% sensitivity). In TCGA LUAD, specificity estimates were 88% for WES and 95% for RNA-seq, and sensitivity estimates for identifying gold-standard events were 84% and 35%, respectively. Events detected exclusively in RNA-seq are classified as false positives in our sensitivity/specificity calculations. We assessed the possible validity of those events by investigating correlations between RNA-seq-only events and clinical phenotypes such as cancer stage, overall survival, and mutations in 21 bona fide tumor suppressor genes (TSGs) involved in chromatin modification. In the COAD study, we found a marginally significant correlation between the number of RNA-seq-only events and the number of mutations in this set of TSGs (p-value = 0.07, correlation = 0.19); we did not observe such a correlation in the LUAD study. This difference may be explained by a potentially larger influence of extrinsic environmental factors, such as smoking, leading to epigenetic changes in LUAD. In LUAD, we observed a potential clinical association between the presence of RNA-specific events and worse overall survival (p-value = 0.15). Our results suggest that RNA-seq may be used to detect AI when DNA profiles are unavailable, although at lower sensitivity. Clinically, our results may indicate that a higher RNA-specific AI load is associated with a worse overall prognosis; we are exploring this further in additional TCGA datasets. The AI we detect in the RNA-seq samples may reflect long-range epigenetic dysregulation. RNA-exclusive AI may be caused by mutations in cancer driver and lineage genes, which may lead to selective cis-epigenetic chromatin alterations affecting large genomic regions.

Citation Format: Zuhal Ozcan, F. Anthony San Lucas, Yasminka Yakubek, Yihua Liu, Richard G. Fowler, Paul A. Scheet. The University of Texas M.D. Detection of chromosomal allelic imbalances from RNAseq data using hapLOHseq [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 1662.
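
Note on the sensitivity/specificity calculation: the abstract does not state the unit at which AI calls are compared with the SNP-array gold standard. The sketch below assumes, purely for illustration, that the genome is divided into fixed bins and that each bin is either called AI or not. The function name, the bin representation, and the toy data are hypothetical and are not taken from hapLOHseq or from the study. It shows how events called only in RNA-seq count as false positives and therefore lower specificity.

```python
from typing import Set, Tuple


def sensitivity_specificity(
    called: Set[int], gold: Set[int], all_bins: Set[int]
) -> Tuple[float, float]:
    """Compare AI calls with gold-standard AI over a set of genomic bins.

    Assumption: each integer identifies one genomic bin (hypothetical unit).
    """
    tp = len(called & gold)             # called AI, also AI in the gold standard
    fp = len(called - gold)             # called AI only in the test data (e.g. RNA-seq only)
    fn = len(gold - called)             # gold-standard AI that was missed
    tn = len(all_bins - called - gold)  # no AI in either data set
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data (not from the study): 100 bins, 20 of them AI in the gold standard.
all_bins = set(range(100))
gold = set(range(0, 20))
rna_calls = set(range(0, 8)) | {50, 51}  # 8 shared bins plus 2 RNA-seq-only bins

sens, spec = sensitivity_specificity(rna_calls, gold, all_bins)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Under this convention, RNA-seq-only bins reduce specificity even if they reflect real allele-specific expression without an underlying DNA-level imbalance, which is why the abstract examines those events separately.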