Abstract

The detection of somatic single nucleotide variants is a crucial component of the characterization of the cancer genome. Mutation calling algorithms thus far have focused on comparing the normal and tumor genomes from the same individual. In recent years, it has become routine for projects like The Cancer Genome Atlas (TCGA) to also sequence the tumor RNA. Here we present RADIA (RNA and DNA Integrated Analysis), a novel computational method that combines the patient-matched normal and tumor DNA with the tumor RNA to detect somatic mutations. Including the RNA increases the power to detect somatic mutations, especially at low DNA allelic frequencies. By integrating an individual’s DNA and RNA, we are able to detect mutations that would otherwise be missed by traditional algorithms that examine only the DNA. We demonstrate high sensitivity (84%) and very high precision (98% and 99%) for RADIA in patient data from endometrial carcinoma and lung adenocarcinoma from TCGA. Mutations with both high DNA and RNA read support have the highest validation rate, at over 99%. We also introduce a simulation package that spikes artificial mutations into patient data, rather than simulating sequencing data from a reference genome. We evaluate sensitivity on the simulation data and demonstrate our ability to rescue mutations at low DNA allelic frequencies by including the RNA. Finally, we highlight mutations in important cancer genes that were rescued through the incorporation of the RNA.

Highlights

  • Much of our current understanding of cancer has come from investigating how normal cells are transformed into cancerous cells through the stepwise acquisition of somatic genomic abnormalities

  • We evaluated sensitivity and precision on the endometrial carcinoma and lung adenocarcinoma data using validation data that was generated by The Cancer Genome Atlas (TCGA)

  • All patients in this study provided written informed consent to genomic studies in accordance with local Institutional Review Boards (Table S2) and the policies and guidelines outlined by the Ethics, Law and Policy Group from TCGA


Summary

Introduction

Much of our current understanding of cancer has come from investigating how normal cells are transformed into cancerous cells through the stepwise acquisition of somatic genomic abnormalities. These events include point mutations, insertions and deletions (INDELs), chromosomal rearrangements, and changes to the copy number of segments of DNA. With the development of RNA sequencing (RNA-Seq) [3], TCGA began providing an additional high-throughput tumor sequence dataset. These three datasets, consisting of tumor DNA, patient-matched normal DNA, and tumor RNA, have become a new standard in cancer genomics. RNA-Seq makes it possible to investigate the consequences of genomic changes in the RNA transcripts they encode, and thereby to better characterize 1) germline variants, 2) somatic mutations, and 3) variants in the RNA that are not found in the DNA and that could be the result of RNA editing [4].
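To make the three variant categories concrete, the sketch below sorts a single genomic site by which of the three datasets show support for the alternate allele. It is a toy illustration under stated assumptions, not the filtering logic used by RADIA: the names `ReadCounts`, `has_support`, and `classify_site`, and the thresholds of 4 reads and 10% allelic fraction, are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class ReadCounts:
    """Reference and alternate read counts at one site (hypothetical container)."""
    ref: int
    alt: int

    @property
    def alt_fraction(self) -> float:
        # Allelic frequency of the alternate allele; 0.0 when there is no coverage.
        depth = self.ref + self.alt
        return self.alt / depth if depth else 0.0


def has_support(counts: ReadCounts, min_alt_reads: int = 4,
                min_alt_fraction: float = 0.10) -> bool:
    # Illustrative thresholds only; real callers apply many more filters.
    return counts.alt >= min_alt_reads and counts.alt_fraction >= min_alt_fraction


def classify_site(normal_dna: ReadCounts, tumor_dna: ReadCounts,
                  tumor_rna: ReadCounts) -> str:
    if has_support(normal_dna):
        return "germline"                 # variant already present in the normal DNA
    if has_support(tumor_dna):
        return "somatic"                  # callable from the DNA pair alone
    if has_support(tumor_rna) and normal_dna.alt == 0:
        if tumor_dna.alt > 0:
            return "somatic_rna_rescued"  # weak DNA evidence, strong RNA evidence
        return "rna_only"                 # absent from DNA: candidate RNA editing
    return "no_call"


if __name__ == "__main__":
    # Two of 50 tumor DNA reads carry the variant, but the RNA supports it strongly.
    print(classify_site(ReadCounts(40, 0), ReadCounts(48, 2), ReadCounts(10, 30)))
```

In this toy scheme, a site with a low tumor DNA allelic frequency falls below the DNA threshold on its own, and the RNA evidence is what moves it from an uncalled site to a rescued somatic call.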

