Abstract

Quantitative trait locus (QTL) mapping of molecular phenotypes such as metabolites, lipids and proteins through genome-wide association studies represents a powerful means of highlighting molecular mechanisms relevant to human diseases. However, a major challenge of this approach is to identify the causal gene(s) at the observed QTLs. Here, we present a framework for the ‘Prioritization of candidate causal Genes at Molecular QTLs’ (ProGeM), which incorporates biological domain-specific annotation data alongside genome annotation data from multiple repositories. We assessed the performance of ProGeM using a reference set of 227 previously reported and extensively curated metabolite QTLs. For 98% of these loci, the expert-curated gene was one of the candidate causal genes prioritized by ProGeM. Benchmarking analyses revealed that 69% of the causal candidates were nearest to the sentinel variant at the investigated molecular QTLs, indicating that genomic proximity is the most reliable indicator of ‘true positive’ causal genes. In contrast, cis-gene expression QTL data led to three false positive candidate causal gene assignments for every one true positive assignment. We provide evidence that these conclusions also apply to other molecular phenotypes, suggesting that ProGeM is a powerful and versatile tool for annotating molecular QTLs. ProGeM is freely available via GitHub.
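The proximity criterion mentioned above can be illustrated with a minimal sketch. This is not ProGeM's implementation: the `Gene` record, the gene symbols, the coordinates and the choice to measure distance to the gene body (rather than, say, the transcription start site) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Gene:
    symbol: str  # hypothetical gene label
    chrom: str   # chromosome name
    start: int   # gene start coordinate (inclusive)
    end: int     # gene end coordinate (inclusive)


def distance_to_gene(chrom: str, pos: int, gene: Gene) -> float:
    """Base-pair distance from a variant to a gene body (0 if inside it)."""
    if gene.chrom != chrom:
        return float("inf")  # genes on other chromosomes are never nearest
    if gene.start <= pos <= gene.end:
        return 0.0
    return float(min(abs(pos - gene.start), abs(pos - gene.end)))


def nearest_gene(chrom: str, pos: int, genes: List[Gene]) -> Optional[Gene]:
    """Return the gene on the same chromosome closest to the variant, if any."""
    same_chrom = [g for g in genes if g.chrom == chrom]
    if not same_chrom:
        return None
    return min(same_chrom, key=lambda g: distance_to_gene(chrom, pos, g))


# Hypothetical annotation and sentinel variant, for illustration only.
genes = [
    Gene("GENE_A", "1", 1000, 5000),
    Gene("GENE_B", "1", 9000, 12000),
    Gene("GENE_C", "2", 100, 900),
]
nearest = nearest_gene("1", 8200, genes)
print(nearest.symbol if nearest else "no gene on this chromosome")
```

A real analysis would read gene coordinates from a genome annotation resource and would also consider proxy variants, which this sketch omits.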

Highlights

  • The framework of ProGeM is based on the assumption that, for a gene to be causal for a molecular quantitative trait locus (QTL) or for any other phenotype, it must fulfil two requirements: (i) the gene product must exhibit altered structure, abundance or function as a result of the sentinel or proxy variants at the QTL and (ii) the gene must be involved in the molecular mechanism that influences the trait in question

  • When we investigated in more detail the 24 metabolite QTLs (mQTLs) for which the true positive causal gene contained a moderate-impact sentinel variant, we found that just nine of these genes were cis-expression QTL (eQTL) genes for either the sentinel or a proxy variant
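The two requirements stated in the highlights above can be read as a set intersection: a gene qualifies only if it appears both among the genes affected by the sentinel or proxy variants and among the genes implicated in the trait's molecular mechanism. The sketch below is a hedged illustration of that logic, not ProGeM's code; the function name, the gene symbols and both input sets are hypothetical.

```python
from typing import Set


def genes_meeting_both_requirements(
    variant_affected: Set[str],
    mechanism_linked: Set[str],
) -> Set[str]:
    """Keep genes that satisfy requirement (i) and requirement (ii).

    variant_affected: genes whose product is altered by the sentinel or
        proxy variants (requirement i), however that evidence was obtained.
    mechanism_linked: genes annotated as involved in the molecular
        mechanism influencing the trait (requirement ii).
    """
    return variant_affected & mechanism_linked


# Hypothetical evidence at a single QTL, for illustration only.
variant_affected = {"GENE_A", "GENE_B"}
mechanism_linked = {"GENE_B", "GENE_D"}

candidates = genes_meeting_both_requirements(variant_affected, mechanism_linked)
print(sorted(candidates))
```

Treating the two evidence types as independent sets keeps each requirement separately inspectable, so a locus with no gene in the intersection can still be reviewed using either set on its own.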

Introduction

With the continued application of genome-wide association studies (GWAS) to human disease aetiology [1,2,3,4], the rate at which susceptibility loci are discovered is far outstripping the rate at which we can elucidate the biological mechanisms underlying them. This represents a major bottleneck to translational progress. Quantitative trait locus (QTL) mapping of molecular, intermediate phenotypes provides a powerful means to functionally annotate and characterize GWAS signals for complex traits in a high-throughput manner. The resulting catalogue of molecular QTLs, cutting across multiple ‘omic modalities, can be readily queried to elucidate the functional impact of disease-associated variants on the abundance of transcripts, epigenetic marks, proteins, lipids and metabolites.
