Abstract

Background: High throughput sequencing technology enables both the human genome and transcriptome to be screened at single nucleotide resolution. Tools have been developed to infer single nucleotide variants (SNVs) from both DNA and RNA sequencing data. To evaluate how much difference can be expected between DNA and RNA sequencing data, and among tissue sources, we designed a study to examine the single nucleotide differences among five sources of high throughput sequencing data generated from the same individual: exome sequencing from blood, tumor, and adjacent normal tissue, and RNAseq from tumor and adjacent normal tissue.

Results: Through careful quality control and analysis of the SNVs, we found little difference between DNA-DNA pairs (1%–2%). However, between DNA-RNA pairs, SNV differences ranged from 10% to 20%.

Conclusions: Only a small portion of these differences can be explained by RNA editing. Instead, the majority of the DNA-RNA differences should be attributed to technical errors from sequencing and post-processing of RNAseq data. Our analysis results suggest that SNV detection using RNAseq is subject to high false positive rates.

Highlights

  • High throughput sequencing technology enables both the human genome and transcriptome to be screened at single nucleotide resolution

  • Single nucleotide variants (SNVs) are often measured in human specimens to correlate with other phenotypic variables

  • There are two major classes of single nucleotide variants (SNVs): germline mutations, which are inherited with one allele from each parent, and somatic mutations, which are acquired later in life


Summary

Introduction

High throughput sequencing technology enables both the human genome and transcriptome to be screened at single nucleotide resolution. To evaluate how much difference can be expected between DNA and RNA sequencing data, and among tissue sources, we designed a study to examine the single nucleotide differences among five sources of high throughput sequencing data generated from the same individual: exome sequencing from blood, tumor, and adjacent normal tissue, and RNAseq from tumor and adjacent normal tissue. The detection of SNVs can be achieved through a variety of methods, including real time polymerase chain reaction (RT-PCR), genotyping array, Sanger sequencing, and high throughput sequencing. All of these methods use genomic DNA as the input source.
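The pairwise comparison among sequencing sources described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the genotype encoding, and the choice to define "difference" as the fraction of sites called in both sources whose genotypes disagree are all assumptions made for illustration, and the toy calls are invented placeholders rather than data from the study.

```python
from itertools import combinations


def discordance_rate(calls_a, calls_b):
    """Fraction of sites genotyped in BOTH sources whose genotypes differ.

    Each argument maps a site, here a (chromosome, position) tuple,
    to a genotype string. This definition of "difference" is an
    assumption; the source text does not spell out its exact formula.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        return None  # no overlapping sites, so the rate is undefined
    differing = sum(1 for site in shared if calls_a[site] != calls_b[site])
    return differing / len(shared)


def pairwise_discordance(sources):
    """Discordance rate for every unordered pair of named sources."""
    return {
        (a, b): discordance_rate(sources[a], sources[b])
        for a, b in combinations(sorted(sources), 2)
    }


# Hypothetical toy calls, for illustration only (not study data).
toy_sources = {
    "exome_blood": {("chr1", 100): "A/G", ("chr1", 200): "C/C", ("chr2", 50): "T/T"},
    "rnaseq_tumor": {("chr1", 100): "A/G", ("chr1", 200): "C/T", ("chr3", 10): "G/G"},
}

toy_rates = pairwise_discordance(toy_sources)
```

Restricting the comparison to sites called in both sources matters for DNA-RNA pairs, because RNAseq only covers expressed regions; a real analysis would also need the coverage and quality filtering that the abstract refers to as quality control.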

