False positives in trans-eQTL and co-expression analyses arising from RNA-sequencing alignment errors

Ashis Saha, Alexis Battle

doi:10.12688/f1000research.17145.2

Open Access

Journal: F1000Research	Publication Date: Apr 8, 2019
Citations: 50	License type: CC BY 4.0

Affiliation: Johns Hopkins University

Abstract

Sequence similarity among distinct genomic regions can lead to errors in alignment of short reads from next-generation sequencing. While this is well known, the downstream consequences of misalignment have not been fully characterized. We assessed the potential for incorrect alignment of RNA-sequencing reads to cause false positives in both gene expression quantitative trait locus (eQTL) and co-expression analyses. Trans-eQTLs identified from human RNA-sequencing studies appeared to be particularly affected by this phenomenon, even when only uniquely aligned reads were considered. Over 75% of trans-eQTLs identified using a standard pipeline occurred between regions of sequence similarity and therefore could be due to alignment errors. Further, associations due to mapping errors are likely to misleadingly replicate between studies. To help address this problem, we quantified the potential for "cross-mapping" to occur between every pair of annotated genes in the human genome. Such cross-mapping data can be used to filter or flag potential false positives in both trans-eQTL and co-expression analyses. Such filtering substantially alters the detection of significant associations and can have an impact on the assessment of false discovery rate, functional enrichment, and replication for RNA-sequencing association studies.

Highlights

Sequence similarity among distinct genomic regions makes alignment of short sequencing reads difficult[1,2].
We focus on evidence that sequence similarity between pairs of genes and resulting alignment errors between them may lead to false positives in association studies from RNA-sequencing (RNA-seq) data, in expression quantitative trait locus (eQTL) and co-expression analyses. eQTL studies, revealing associations between genetic variants and gene expression levels, have contributed to a greater understanding of gene regulation and genetics of complex traits[7,8,9].
Effect of cross-mappability on trans-eQTL detection: To investigate the effects of alignment errors on trans-eQTL detection, we performed a standard trans-eQTL analysis using data from the Genotype-Tissue Expression (GTEx) project for five human tissues.

Summary

Introduction

Sequence similarity among distinct genomic regions makes alignment of short sequencing reads difficult[1,2]. We focus on evidence that sequence similarity between pairs of genes and resulting alignment errors between them may lead to false positives in association studies from RNA-sequencing (RNA-seq) data, in expression quantitative trait locus (eQTL) and co-expression analyses. For example, when reads originating from Gene A are erroneously aligned to a similar Gene B, a variant associated with expression of Gene A may appear to be associated with Gene B, giving rise to a false positive trans-eQTL. We note that such errors are not entirely mitigated by filtering multi-mapped reads: some alignment errors may remain between similar regions even among uniquely aligned reads due to genetic variation, errors in the reference genome, and other complications.
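The flagging idea described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the table layout (a dictionary keyed by gene pair), the score threshold, the choice to check both directions of a pair, and the notion of which genes count as "near" a variant are all assumptions made for illustration, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical input: (gene_a, gene_b) -> cross-mapping score.
# A missing pair is treated as "no known cross-mapping" (score 0.0).
CrossMap = Dict[Tuple[str, str], float]


def cross_maps(a: str, b: str, cross_map: CrossMap, threshold: float = 0.0) -> bool:
    """Return True if either direction of the gene pair scores above the threshold."""
    return (cross_map.get((a, b), 0.0) > threshold
            or cross_map.get((b, a), 0.0) > threshold)


def flag_trans_eqtls(
    hits: Iterable[Tuple[str, str]],
    genes_near_variant: Dict[str, Set[str]],
    cross_map: CrossMap,
    threshold: float = 0.0,
) -> List[Tuple[str, str, bool]]:
    """Flag each (variant, target_gene) hit whose target gene cross-maps
    with any gene located near the variant (assumed precomputed)."""
    flagged = []
    for variant, target in hits:
        nearby = genes_near_variant.get(variant, set())
        suspicious = any(cross_maps(g, target, cross_map, threshold) for g in nearby)
        flagged.append((variant, target, suspicious))
    return flagged


def flag_coexpressed_pairs(
    pairs: Iterable[Tuple[str, str]],
    cross_map: CrossMap,
    threshold: float = 0.0,
) -> List[Tuple[str, str, bool]]:
    """Flag co-expressed gene pairs that cross-map with each other."""
    return [(a, b, cross_maps(a, b, cross_map, threshold)) for a, b in pairs]


if __name__ == "__main__":
    # Toy data; identifiers and scores are invented for illustration only.
    toy_cross_map = {("GENE_A", "GENE_B"): 12.5}
    toy_near = {"rs1": {"GENE_A"}, "rs2": {"GENE_D"}}
    toy_hits = [("rs1", "GENE_B"), ("rs2", "GENE_C")]
    for row in flag_trans_eqtls(toy_hits, toy_near, toy_cross_map):
        print(row)
```

Flagging rather than silently dropping associations keeps the decision visible, so a flagged hit can still be examined with other evidence before it is excluded.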
