Abstract
Background: Alternative transcription initiation (ATI) is frequently observed in cancer, suggesting that it contributes to malignant transformation. However, ATI remains largely unexplored, mainly because tools for detecting it are lacking. We propose a computational method that integrates bulk ATAC-seq and bulk RNA-seq to identify ATI events and characterize their differential usage between tissues. We hypothesize that differential ATAC-seq intensity can guide the search for differential promoter usage, which might enable the identification of novel ATI events in a transcript-agnostic way. Methods: We recently published methods for base-level RNA-seq analysis that identify structural variations in transcripts. Building on this work, we developed a supervised sparse non-negative matrix factorization approach that integrates base-level RNA-seq and ATAC-seq to identify ATIs and characterize the latent structure of the underlying isoforms. The method has several features that are useful for identifying novel ATIs. Because it scans the entire collection of accessible DNA regions provided by ATAC-seq, it enables comprehensive screening of novel ATI candidates that is not limited to known promoters. Additionally, the uncompressed view of base-level RNA-seq allows us to infer the structure of individual isoforms independently of known gene annotation. The predicted isoforms can then be used to deconvolute the base-level RNA-seq of individual cases and thus infer the expression level of each isoform. Results: We applied this method to a sub-cohort (N=350) of TCGA pan-cancer samples for which both ATAC-seq and RNA-seq are available. Empirical comparison with existing methods confirmed known true-positive ATIs, including those in important cancer genes such as CDKN2A and ALK. 
In particular, the method successfully identified ATIs in challenging cases, such as ATIs located in internal introns or at exons that are constitutive in other transcripts. Deconvolution analysis extended to all available ~10,000 TCGA pan-cancer samples across 32 tissue types revealed that ATIs were predominantly differentiated by tissue type. By applying the method to a set of transcription regulator genes, we found that ~1% of the genes had ATIs, including novel cancer-specific genes whose known ATIs had not previously been reported as relevant to cancer. Additionally, we provide examples in which the isoform’s function in relation to cancer appears mechanistically interesting. Conclusion: We propose a multi-omics integration method that is independent of known gene annotation, enabling robust identification of ATIs. Our results demonstrate that the method detects known as well as novel ATIs that are otherwise difficult to identify with existing methods.

Citation Format: Hyo Young Choi, Won-Young Choi, David N. Hayes. Novel framework for systematically detecting alternative transcript initiation by integrating ATAC-seq and RNA-seq [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 2076.
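The deconvolution step mentioned in the Methods can be illustrated with a toy example. The sketch below is not the authors' supervised sparse factorization; it is a plain non-negative least-squares fit using multiplicative updates, and it assumes the isoform coverage profiles are already known. The profile values, the names `canonical_start` and `alternative_start`, and the function `deconvolute` are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: per-base coverage profiles of two isoforms over
# 8 positions. Values are illustrative, not taken from the abstract.
ISOFORM_PROFILES = {
    "canonical_start": [1, 1, 1, 0, 0, 1, 1, 1],    # transcription starts at position 0
    "alternative_start": [0, 0, 0, 0, 1, 1, 1, 1],  # transcription starts at position 4
}


def deconvolute(coverage, profiles, n_iter=200, eps=1e-12):
    """Estimate non-negative abundances h with sum_k h[k] * profiles[k] ~ coverage.

    Uses multiplicative updates, which keep h non-negative at every step.
    """
    k = len(profiles)
    # Precompute W^T x and W^T W, where the rows of W^T are the isoform profiles.
    wtx = [sum(p * c for p, c in zip(prof, coverage)) for prof in profiles]
    wtw = [[sum(a * b for a, b in zip(pi, pj)) for pj in profiles] for pi in profiles]
    h = [1.0] * k
    for _ in range(n_iter):
        wtwh = [sum(wtw[i][j] * h[j] for j in range(k)) for i in range(k)]
        # Simultaneous update of all abundances from the previous iterate.
        h = [h[i] * wtx[i] / (wtwh[i] + eps) for i in range(k)]
    return h


names = list(ISOFORM_PROFILES)
profiles = [ISOFORM_PROFILES[n] for n in names]

# Simulate one sample's base-level coverage as a mixture of the two isoforms.
true_abundance = [3.0, 5.0]
coverage = [sum(a * p[pos] for a, p in zip(true_abundance, profiles)) for pos in range(8)]

abundance = deconvolute(coverage, profiles)
for name, value in zip(names, abundance):
    print(f"{name}: {value:.3f}")
```

In this noise-free setting the estimated abundances should land close to the simulated values of 3 and 5. Real base-level coverage is noisy and the isoform profiles themselves must be learned, which is where the sparsity and supervision described in the abstract would come in.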