Abstract
Introduction: Single nucleotide variants (SNVs) are the most abundant form of genetic variation in the human genome. Unlike protein-truncating alterations, the functional impact of SNVs can be very difficult to infer from DNA sequencing alone. Although functional techniques such as RNA-Seq have been used to quantify gene expression in normal and tumor samples, to date there has been no robust tool to systematically evaluate the transcriptional perturbations caused by germline and somatic SNVs. Here, we describe R2D2, a computational algorithm that jointly analyzes matched germline and tumor DNA and RNA sequencing data to infer the functional impact of coding SNVs.

Methods: R2D2 was developed to evaluate the allelic fraction of called SNVs across matched normal and tumor genomic and transcriptomic data and to classify these variants into predefined categories of potential functional and biological significance. Ideally, four matched variant sets (DNA and RNA from both normal and tumor tissue) would be available to classify both germline and somatic SNVs. However, R2D2 can also analyze other combinations of DNA and RNA sequencing data to infer the functional impact of detected SNVs. R2D2 has been optimized to work with commonly used variant calling and annotation tools such as the Genome Analysis Toolkit (GATK) and Oncotator.

Results: We used R2D2 to evaluate the functional impact of germline and somatic SNVs in matched germline and tumor whole-exome and transcriptome sequencing data from five patients with primary lung adenocarcinoma. On average, standard variant calling algorithms detected 34,227 (range: 33,541-35,698) SNVs per patient across all sample types. These pipelines classified the variants only as germline or somatic. However, surveying the allelic fraction (AF) of each variant across all sample types using R2D2 revealed unexpected patterns of AF deviation, such as imbalanced allelic expression, somatic biallelic inactivation, and aberrant RNA splicing. For example, R2D2 identified somatic loss of heterozygosity at an average of 186 (range: 70-409) SNVs per patient, which can be informative when studying potential mechanisms of tumorigenesis. In addition, 13.2% (range: 9.0-17.3%) of all detected SNVs showed imbalanced RNA expression, in which one allele was preferentially expressed in the normal or tumor tissue. Lastly, we identified 1,454 (range: 1,349-1,546) SNVs per patient that were detected only in the normal or tumor RNA-Seq data.

Conclusion: We developed R2D2 to help pinpoint functional SNVs of potential clinical significance. R2D2 is an easy-to-implement and flexible module that can be integrated with various variant-calling pipelines to identify unexpected patterns of variant expression.

Citation Format: Alma Imamovic, Saud H. AlDubayan, Nathanael Moore, Celine G. Han, Brendan Reardon, Eliezer M. Van Allen. R2D2: An integrated analysis framework to infer the functional impact of single nucleotide variants (SNVs) using matched germline and tumor DNA and RNA sequencing data [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 5296.
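The allelic-fraction comparison described in the Methods can be illustrated with a minimal Python sketch. This is not R2D2's implementation: the abstract does not state its classification rules, so the function name `classify_snv`, the category labels, and every threshold below are assumptions chosen only to show how AF values from the four sample types might be compared.

```python
from typing import List, Optional

# Hypothetical cutoffs for illustration only; R2D2's actual thresholds are not given in the abstract.
PRESENT = 0.05                  # minimum AF to treat a variant as present in a sample
HET_LOW, HET_HIGH = 0.30, 0.70  # AF window treated as heterozygous
HOM = 0.90                      # AF at or above which one allele is treated as lost
IMBALANCE = 0.25                # |RNA AF - DNA AF| treated as imbalanced expression


def present(af: Optional[float]) -> bool:
    """True when the sample was sequenced (AF is not None) and the variant is detectable."""
    return af is not None and af >= PRESENT


def classify_snv(
    normal_dna: Optional[float],
    tumor_dna: Optional[float],
    normal_rna: Optional[float] = None,
    tumor_rna: Optional[float] = None,
) -> List[str]:
    """Return illustrative labels for one SNV given its AF in each sample type.

    Pass None for a sample type that is unavailable, mirroring the idea that
    not all four data sets need to be present.
    """
    labels: List[str] = []
    in_dna = present(normal_dna) or present(tumor_dna)
    in_rna = present(normal_rna) or present(tumor_rna)

    if in_dna:
        # Seen in normal DNA -> germline; seen only in tumor DNA -> somatic.
        labels.append("germline" if present(normal_dna) else "somatic")
    elif in_rna:
        # Detected only in RNA-Seq data.
        labels.append("rna_only")

    # Somatic loss of heterozygosity: heterozygous in normal DNA, but one
    # allele (variant or reference) nearly absent in tumor DNA.
    if normal_dna is not None and tumor_dna is not None:
        if HET_LOW <= normal_dna <= HET_HIGH and (
            tumor_dna >= HOM or tumor_dna <= 1 - HOM
        ):
            labels.append("somatic_loh")

    # Imbalanced expression: RNA AF deviates strongly from the matched DNA AF.
    for tissue, dna, rna in (
        ("normal", normal_dna, normal_rna),
        ("tumor", tumor_dna, tumor_rna),
    ):
        if present(dna) and rna is not None and abs(rna - dna) >= IMBALANCE:
            labels.append(f"imbalanced_expression_{tissue}")

    return labels
```

For example, under these assumed cutoffs `classify_snv(0.5, 0.95, 0.5, 0.9)` yields `["germline", "somatic_loh"]`, and `classify_snv(0.0, 0.0, 0.0, 0.3)` yields `["rna_only"]`. A real analysis would also need to account for read depth and sampling noise before comparing AF values, which this sketch omits.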