Abstract

RNA sequencing (RNA-seq) is the current gold-standard method to quantify gene expression for expression quantitative trait locus (eQTL) studies. However, a potential caveat in these studies is that RNA-seq reads carrying the non-reference allele of variant loci may be less likely to map correctly to the reference genome, which could bias gene quantifications and cause false positive eQTL associations. In this study, we analyze the effect of this allelic mapping bias on eQTL discovery. We simulate RNA-seq read mapping over 9.5 M common SNPs and indels, and find that 15.6% of variants show a biased mapping rate for reference versus non-reference reads. However, removing potentially biased RNA-seq reads from an eQTL dataset of 185 individuals has a very small effect on gene and exon quantifications and on eQTL discovery. We detect only a handful of likely false positive eQTLs, and overall eQTL SNPs show no significant enrichment for high mapping bias. Our results suggest that RNA-seq quantifications are generally robust against allelic mapping bias, and that this bias does not have a severe effect on eQTL discovery. Nevertheless, we provide our catalog of putatively biased loci so that future RNA-seq studies can better control for mapping bias and obtain more accurate results.

Highlights

  • RNA sequencing (RNA-seq) is the current gold-standard method to quantify gene expression for expression quantitative trait locus studies

  • Gene expression quantification from RNA-seq can potentially be biased by genetic variation affecting the mapping of RNA-seq reads, as reads that carry the non-reference allele can have a lower probability of mapping correctly to the reference genome [8]

  • In this study we examined whether allelic mapping bias of RNA-seq reads is a confounding factor in expression quantitative trait locus (eQTL) analysis


Summary

Introduction

RNA sequencing (RNA-seq) is the current gold-standard method to quantify gene expression for expression quantitative trait locus (eQTL) studies. A potential caveat in these studies is that genetic variation can affect the mapping of RNA-seq reads: reads carrying the non-reference allele of a variant locus can have a lower probability of mapping correctly to the reference genome [8], which could bias gene quantifications and cause false positive eQTL associations. This is analogous to single-nucleotide polymorphisms (SNPs) in the probes of expression microarrays, a relatively well-documented technical problem [9,10]. A small proportion of eQTL associations are likely to be false positives due to allelic mapping bias, and correcting for these effects will lead to more accurate results.
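
To make the idea of allelic mapping bias concrete, the following is a minimal sketch of how a per-variant bias could be summarized from simulated reads. It is not the authors' pipeline. It assumes that equal numbers of reference and non-reference reads were simulated at each variant, so an unbiased variant has a reference ratio of 0.5 among correctly mapped reads. The names `SimulatedVariant`, `reference_ratio`, and `flag_biased`, and the `tolerance` value of 0.05, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SimulatedVariant:
    """Mapping counts for one variant (hypothetical structure)."""
    variant_id: str
    ref_reads_mapped: int  # simulated reference-allele reads mapped correctly
    alt_reads_mapped: int  # simulated non-reference-allele reads mapped correctly


def reference_ratio(variant: SimulatedVariant) -> Optional[float]:
    """Fraction of correctly mapped reads that carry the reference allele.

    Returns None when no simulated read mapped, since the ratio is undefined.
    """
    total = variant.ref_reads_mapped + variant.alt_reads_mapped
    if total == 0:
        return None
    return variant.ref_reads_mapped / total


def flag_biased(variants: List[SimulatedVariant],
                tolerance: float = 0.05) -> List[str]:
    """Return IDs of variants whose reference ratio deviates from 0.5.

    The tolerance is an assumed, illustrative cutoff, not a value from the study.
    Variants with no mapped reads are skipped rather than flagged.
    """
    flagged = []
    for variant in variants:
        ratio = reference_ratio(variant)
        if ratio is not None and abs(ratio - 0.5) > tolerance:
            flagged.append(variant.variant_id)
    return flagged


example = [
    SimulatedVariant("var_a", ref_reads_mapped=50, alt_reads_mapped=50),
    SimulatedVariant("var_b", ref_reads_mapped=50, alt_reads_mapped=30),
    SimulatedVariant("var_c", ref_reads_mapped=0, alt_reads_mapped=0),
]
biased_ids = flag_biased(example)
```

In this toy input, only `var_b` deviates from the expected ratio. Reads overlapping flagged variants could then be excluded before quantification, which is the kind of filtering whose effect on eQTL discovery the study evaluates.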

