Abstract

Motivation: Standard genome-wide association studies, testing the association between one phenotype and a large number of single nucleotide polymorphisms (SNPs), are limited in two ways: (i) traits are often multivariate, and analysis of composite scores entails loss in statistical power and (ii) gene-based analyses may be preferred, e.g. to reduce the multiple testing burden.

Results: Here we present a new method, multivariate gene-based association test by extended Simes procedure (MGAS), that allows gene-based testing of multivariate phenotypes in unrelated individuals. Through extensive simulation, we show that under most trait-generating genotype–phenotype models MGAS has superior statistical power to detect associated genes compared with gene-based analyses of univariate phenotypic composite scores (i.e. GATES, multiple regression), and multivariate analysis of variance (MANOVA). Re-analysis of metabolic data revealed 32 False Discovery Rate controlled genome-wide significant genes, and 12 regions harboring multiple genes; of these 44 regions, 30 were not reported in the original analysis.

Conclusion: MGAS allows researchers to conduct their multivariate gene-based analyses efficiently, and without the loss of power that is often associated with an incorrectly specified genotype–phenotype model.

Availability and implementation: MGAS is freely available in KGG v3.0 (http://statgenpro.psychiatry.hku.hk/limx/kgg/download.php). Access to the metabolic dataset can be requested at dbGaP (https://dbgap.ncbi.nlm.nih.gov/). The R-simulation code is available from http://ctglab.nl/people/sophie_van_der_sluis.

Contact: mxli@hku.hk

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

  • Standard genome-wide association studies (GWAS) involve the univariate regression of one trait on a large number of genetic variants, while adapting the nominal α criterion level for the extensive multiple testing

  • Noteworthy: the variance–covariance structure of data generated through a network model may closely mimic the variance–covariance structure of data generated through a 1-factor model (Van der Sluis et al, 2013; Van der Maas et al, 2006), implying that factor analytic results are not sufficient to determine the true trait-generating model. Various simulation studies have shown that overreliance on the unidimensional trait-generating model, and the associated use of univariate composite scores, can result in considerable loss of statistical power to detect genetic variants (Medland and Neale, 2010; Minica et al, 2010; Van der Sluis et al, 2010, 2013)

  • GATES analyses based on the nine individual continuous metabolic phenotypes (Supplementary Table S13) yielded results largely similar to those obtained with MGAS, except that the MGAS P-values are properly corrected for the phenotypic correlations and multiple testing


Summary

Introduction

Standard genome-wide association studies (GWAS) involve the univariate regression of one trait on a large number of genetic variants (single nucleotide polymorphisms, i.e. SNPs), while adapting the nominal α criterion level for the extensive multiple testing (typically α = 5 × 10⁻⁸). This analysis is limited in two important ways. Whether univariate composite scores exhaustively summarize all information in the multivariate data (i.e. are sufficient statistics) depends on the true trait-generating genotype–phenotype model, i.e. the model that describes how the multiple phenotypes and genes jointly generate the observed trait (Van der Sluis et al, 2010, 2013). MGAS is implemented in the knowledge-based mining system for genome-wide genetic studies (KGG v3.0), is freely available (http://statgenpro.psychiatry.hku.hk/limx/kgg/download.php), and has a user-friendly graphical interface for loading P-value files and genetic and phenotypic correlational information, and for visualizing results and annotating sequence variants and interesting genes.
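MGAS takes SNP-level P-values as input and, as its name indicates, builds on the Simes procedure. As a rough illustration of that building block only, the sketch below implements the classic (unextended) Simes combination of a set of P-values into a single gene-level P-value. It is not the MGAS algorithm: MGAS additionally uses genetic and phenotypic correlational information, which this sketch ignores. The function name `simes_p_value` and the example P-values are hypothetical and chosen for illustration.

```python
from typing import Sequence


def simes_p_value(p_values: Sequence[float]) -> float:
    """Classic Simes combined P-value: min over i of m * p_(i) / i.

    p_(1) <= ... <= p_(m) are the sorted input P-values. This is the plain
    Simes test and applies no correction for correlated tests (assumption:
    illustration only, not the extended procedure used by MGAS).
    """
    if not p_values:
        raise ValueError("at least one P-value is required")
    if any(p < 0.0 or p > 1.0 for p in p_values):
        raise ValueError("P-values must lie in [0, 1]")
    m = len(p_values)
    ordered = sorted(p_values)
    # Each sorted P-value is scaled by m / rank; the smallest scaled value wins.
    return min(m * p / rank for rank, p in enumerate(ordered, start=1))


# Genome-wide criterion level quoted in the Introduction.
GENOME_WIDE_ALPHA = 5e-8

# Made-up SNP-level P-values for a single hypothetical gene.
snp_p_values = [0.04, 0.001, 0.20, 0.03]
gene_p = simes_p_value(snp_p_values)
print(gene_p, gene_p < GENOME_WIDE_ALPHA)
```

For these made-up values the smallest scaled term comes from the smallest P-value (4 × 0.001 / 1), so the combined gene-level P-value is 0.004, which is far from the genome-wide criterion level.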

The MGAS algorithm
MGAS running time: the divide-and-conquer algorithm
Type I error rate and power: simulation
Particulars MGAS
Implementation: metabolism data
Findings
Discussion
