Abstract

Motivation: In order to discover quantitative trait loci, multi-dimensional genomic datasets combining DNA-seq and ChIP-/RNA-seq require methods that rapidly correlate tens of thousands of molecular phenotypes with millions of genetic variants while appropriately controlling for multiple testing.

Results: We have developed FastQTL, a method that implements a popular cis-QTL mapping strategy in a user- and cluster-friendly tool. FastQTL also proposes an efficient permutation procedure to control for multiple testing. The outcome of permutations is modeled using beta distributions trained from a few permutations, from which adjusted P-values can be estimated at any level of significance with little computational cost. The Geuvadis and GTEx pilot datasets can now be easily analyzed an order of magnitude faster than with previous approaches.

Availability and implementation: Source code, binaries and comprehensive documentation of FastQTL are freely available to download at http://fastqtl.sourceforge.net/

Contact: emmanouil.dermitzakis@unige.ch or olivier.delaneau@unige.ch

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

  • Genome-wide association studies have shown that most common trait-associated variants fall into non-coding genomic regions and likely alter gene regulation (Maurano et al, 2012; Nica et al, 2010)

  • To perform a comprehensive evaluation of FastQTL, we used RNAseq and genotype data produced by both the Geuvadis (Lappalainen et al, 2013; Supplementary material 1) and the GTEx consortia (GTEx Consortium, 2015; Supplementary material 2), two of the largest eQTL studies performed to date

  • We looked at the distributions of the maximum likelihood (ML) estimates of these parameters across all genes in the GEUV_EUR dataset and found first that values of parameter k tend to center around 1.0, in line with what is expected for the top variant (Fig. 1a)
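
The permutation scheme described in the abstract can be illustrated with a minimal sketch. It assumes that the best nominal P-value obtained in each permutation follows a Beta(k, n) distribution, fits the two parameters, and reads the adjusted P-value off the fitted cumulative distribution. Several parts are assumptions made for illustration and are not taken from FastQTL itself: the function names are hypothetical, the null P-values are simulated as the minimum of independent uniform draws rather than produced by real permutations, and the parameters are estimated by the method of moments instead of the maximum likelihood fit mentioned above.

```python
import math
import random
import statistics


def fit_beta_moments(null_pvalues):
    """Method-of-moments estimates (k, n) of a Beta(k, n) distribution.

    Simplification: used here in place of a maximum likelihood fit.
    """
    mean = statistics.fmean(null_pvalues)
    var = statistics.variance(null_pvalues)
    scale = mean * (1.0 - mean) / var - 1.0
    return mean * scale, (1.0 - mean) * scale


def beta_cdf(x, k, n, steps=2000):
    """Regularized incomplete beta function I_x(k, n) by Simpson's rule.

    The substitution t = u**(1/k) removes the integrand singularity at 0
    that appears when k < 1. `steps` must be even.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    upper = x ** k
    h = upper / steps

    def integrand(u):
        return (1.0 - u ** (1.0 / k)) ** (n - 1.0)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    integral = total * h / 3.0 / k
    log_beta = math.lgamma(k) + math.lgamma(n) - math.lgamma(k + n)
    return min(1.0, integral / math.exp(log_beta))


def simulate_null_min_pvalues(n_perm, n_tests, rng):
    """Stand-in for real permutations (assumption): each permutation yields
    the smallest of n_tests independent uniform P-values."""
    return [min(rng.random() for _ in range(n_tests)) for _ in range(n_perm)]


# Toy run: 1000 simulated permutations of a phenotype with 50 cis variants.
rng = random.Random(1)
null_best = simulate_null_min_pvalues(n_perm=1000, n_tests=50, rng=rng)
k_hat, n_hat = fit_beta_moments(null_best)

nominal_p = 1e-4  # best nominal P-value observed in the real data (made up)
adjusted_p = beta_cdf(nominal_p, k_hat, n_hat)
print(f"k = {k_hat:.2f}, n = {n_hat:.1f}, adjusted P = {adjusted_p:.3g}")
```

In this toy setting the tests are independent, so k should come out close to 1 and n close to the number of variants tested. With real genotypes, linkage disequilibrium between variants would be expected to make n smaller than the number of variants.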


Summary

Introduction

Genome-wide association studies have shown that most common trait-associated variants fall into non-coding genomic regions and likely alter gene regulation (Maurano et al, 2012; Nica et al, 2010). This has motivated large-scale studies to catalog candidate regulatory variants (quantitative trait loci; QTLs) associated with various molecular phenotypes (i.e. quantitative molecular traits with a genomic location) across various populations (Lappalainen et al, 2013), cell types (Fairfax et al, 2012) and tissue types (GTEx Consortium, 2015; Ongen et al, 2014). Alternative approaches have been developed to increase discovery power by accounting for confounding factors (Fusi et al, 2012), integrating functional annotations (Gaffney et al, 2012), leveraging allelic imbalance (van de Geijn et al, 2015) or aggregating measurements across multiple tissues (Flutre et al, 2013). QTL mapping requires millions of association tests to scan all possible phenotype-variant pairs in cis (i.e. variants located within a specific window around a phenotype), resulting in millions of nominal P-values. Matrix eQTL (Shabalin, 2012) has recently emerged as a ‘gold standard’ for this task (GTEx Consortium, 2015; Lappalainen et al, 2013) by taking advantage of efficient matrix

Methods
Results
Conclusion
