Abstract

Exploration of genetic variant-to-gene relationships by quantitative trait loci (QTLs) such as expression QTLs (eQTLs) is a frequently used tool in genome-wide association studies (GWAS). However, the wide range of public QTL databases and the lack of batch annotation features complicate a comprehensive annotation of GWAS results. In this work, we introduce the tool “Qtlizer” for annotating lists of human variants with associated changes in gene expression and protein abundance using an integrated database of published QTLs. Features include incorporation of variants in linkage disequilibrium (LD) and reverse search by gene names. Analyzing the database for base pair distances between best significant eQTLs and their affected genes suggests that the commonly used cis-distance limit of 1,000,000 base pairs might be too restrictive, implying a substantial number of wrong and as yet undetected eQTLs. We also ranked genes with respect to the maximum number of tissue-specific eQTL studies in which a most significant eQTL signal was consistent. For the top 100 genes we observed the strongest enrichment with housekeeping genes (P = 2 × 10⁻⁶) and with the 10% highest expressed genes (P = 0.005) after grouping eQTLs by r² > 0.95, underlining the relevance of LD information in eQTL analyses. Qtlizer can be accessed via https://genehopper.de/qtlizer or by using the respective Bioconductor R-package (https://doi.org/10.18129/B9.bioc.Qtlizer).

Highlights

  • In the past decade, genome-wide association studies (GWAS) led to the discovery of tens of thousands of associations of genetic loci in humans with variation in traits and diseases.

  • It is very cumbersome to comprehensively annotate lists of variants or genes with immediate results. Another aspect that the current databases do not sufficiently consider is that, for both GWAS and QTL studies, true association signals are very often accompanied by multiple other variants, which can be attributed to the linkage disequilibrium (LD) structure in the human population.

  • The significance was determined by adjusting for multiple testing using a family-wise error rate (FWER) of 5%, resulting in an adjusted significance level of P = 10⁻¹², which corresponds to 1,000,000 variants and 50,000 genes. Expression quantitative trait locus (eQTL) objects that passed the study-wide significance threshold were flagged as “is_sw_significant”.
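
The threshold in the last highlight can be reproduced with a short calculation. The sketch below assumes a Bonferroni-style correction, that is, the FWER divided by the number of variant-gene pairs; the text states only the FWER, the variant and gene counts, and the resulting level. The function names and the strict comparison are illustrative choices, not Qtlizer's actual implementation; only the flag name "is_sw_significant" comes from the text.

```python
from fractions import Fraction

# Values stated in the text.
FWER = Fraction(5, 100)    # family-wise error rate of 5%
N_VARIANTS = 1_000_000     # number of variants
N_GENES = 50_000           # number of genes


def study_wide_threshold(fwer=FWER, n_variants=N_VARIANTS, n_genes=N_GENES):
    """Return FWER divided by the number of variant-gene tests.

    Assumption: a Bonferroni-style correction over all variant-gene pairs.
    Fractions are used so the result is exact rather than a rounded float.
    """
    return fwer / (n_variants * n_genes)


def is_sw_significant(p_value, threshold=None):
    """Hypothetical helper mirroring the "is_sw_significant" flag.

    Assumption: a P value passes when it is strictly below the threshold.
    """
    if threshold is None:
        threshold = study_wide_threshold()
    return p_value < threshold


if __name__ == "__main__":
    # 0.05 / (1,000,000 * 50,000) = 0.05 / (5 * 10^10)
    print(float(study_wide_threshold()))
    print(is_sw_significant(3e-15), is_sw_significant(1e-8))
```

Under these assumptions the calculation gives exactly 10⁻¹², matching the adjusted significance level quoted above.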


Summary

Introduction

Genome-wide association studies (GWAS) led to the discovery of tens of thousands of associations of genetic loci in humans with variation in traits and diseases. It is very cumbersome to comprehensively annotate lists of variants or genes with immediate results. Another aspect that the current databases do not sufficiently consider is that, for both GWAS and QTL studies, true association signals are very often accompanied by multiple other variants, which can be attributed to the linkage disequilibrium (LD) structure in the human population. Apart from the aforementioned existing QTL platforms, many other tools exist for annotating and characterizing variants resulting from GWAS, for example FUMA¹⁴ and SNiPA¹⁵. These tools target a more general annotation of variants and are restricted to a small number of eQTL datasets. We did not compare them with Qtlizer in more detail in this work.

