Abstract

Genome-wide association studies (GWAS) have found hundreds of novel loci associated with full blood count (FBC) phenotypes. However, most of these studies were performed in a single-phenotype framework, without taking into consideration the clinical relatedness among traits. In this work, in addition to the standard univariate GWAS, we use two different multivariate methods to perform the first multiple-trait GWAS of FBC traits in ∼7000 individuals from the Ugandan General Population Cohort (GPC). We started with the standard univariate GWAS approach. In the first multivariate method, we tested for marker associations with 15 FBC traits simultaneously in a multivariate mixed model implemented in GEMMA, while accounting for the relatedness of individuals, pedigree structure, and population substructure. In this analysis, we provide a framework for combining multiple phenotypes in multivariate GWAS analysis and show evidence of multicollinearity whenever the correlation between traits reaches the threshold of r² ≥ 0.75. This approach identified two known loci and one novel locus. In the second multivariate method, we applied principal component analysis (PCA) to the same 15 correlated FBC traits and then tested for marker associations with each principal component (PC) in univariate linear mixed models implemented in GEMMA. We show that the FBC composite phenotype, as assessed by each PC, expresses information that is not completely encapsulated by the individual FBC traits: this approach identified three known and five novel loci that were not identified by either the standard univariate or the multivariate GWAS method. Across both multivariate methods, we identified six novel loci. As a proof of concept, both multivariate methods also identified the known loci HBB and ITFG3.
The two multivariate methods show that multivariate genotype-phenotype methods increase power and identify novel genotype-phenotype associations not found with the standard univariate GWAS in the same dataset.
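
The two preprocessing ideas in the abstract, flagging trait pairs with r² ≥ 0.75 and deriving PC-based composite phenotypes, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the synthetic three-trait data, and the choice to standardize traits before PCA are all assumptions made for illustration. Only the 0.75 threshold and the overall workflow come from the text.

```python
import numpy as np


def collinear_pairs(traits, names, r2_threshold=0.75):
    """Return (name_i, name_j, r2) for trait pairs with squared Pearson r >= threshold."""
    r2 = np.corrcoef(traits, rowvar=False) ** 2
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if r2[i, j] >= r2_threshold:
                pairs.append((names[i], names[j], float(r2[i, j])))
    return pairs


def pc_phenotypes(traits):
    """Standardize each trait, then return PC scores and explained-variance ratios."""
    # Assumption: traits are z-scored so that no single trait dominates the PCs.
    z = (traits - traits.mean(axis=0)) / traits.std(axis=0, ddof=1)
    # PCA via singular value decomposition of the standardized trait matrix.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    scores = u * s                      # one column of scores per PC
    explained = s**2 / np.sum(s**2)     # fraction of variance per PC
    return scores, explained


# Hypothetical toy data: two nearly identical traits and one independent trait.
rng = np.random.default_rng(0)
n_individuals = 500
shared = rng.normal(size=n_individuals)
toy_traits = np.column_stack([
    shared + 0.1 * rng.normal(size=n_individuals),   # trait_a
    shared + 0.1 * rng.normal(size=n_individuals),   # trait_b (collinear with trait_a)
    rng.normal(size=n_individuals),                  # trait_c (independent)
])
toy_names = ["trait_a", "trait_b", "trait_c"]

flagged = collinear_pairs(toy_traits, toy_names)
pc_scores, explained_ratio = pc_phenotypes(toy_traits)
```

In this sketch, each column of `pc_scores` would play the role of one composite phenotype to be tested in a separate univariate mixed model, while `flagged` lists trait pairs that might be reconsidered before fitting a joint multivariate model.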

Highlights

  • Multivariate linear mixed models have been extensively used in a range of genetics studies (Yu et al, 2006; Kang et al, 2008, 2010; Zhang et al, 2010; Lippert et al, 2011; Loh et al, 2015; Hackinger and Zeggini, 2017)

  • Five novel association signals were identified using the PC-GWAS method (Table 4 and Supplementary Figure 2). The method also found two known associations (HBB and ITFG3) that had previously been reported to be associated with at least one of the fifteen full blood count (FBC) traits. These known associations were identified with the univariate GWAS (UV-GWAS) and MV-GWAS approaches and are described in the Sections “ITFG3 and R3HDM1.”

  • The locus has been linked to increased epidermal growth factor receptor (EGFR) surface abundance, reduced homologous recombination repair frequency, a negative genetic interaction between MUS81−/− and MUS81+/+, decreased viability, and increased vaccinia virus (VACV) infection (Sivan et al, 2013). The gene is expressed in the lymph node, colon, bladder, and whole blood, among other tissues.


Summary

Introduction

Multivariate linear mixed models have been extensively used in a range of genetics studies (Yu et al, 2006; Kang et al, 2008, 2010; Zhang et al, 2010; Lippert et al, 2011; Loh et al, 2015; Hackinger and Zeggini, 2017). This approach has attracted substantial interest in GWAS. We applied two complementary multivariate GWAS strategies in nearly 5000 genotyped samples and validated the associated genetic variants in ∼2000 individuals with whole genome sequencing (WGS), sampled from the Ugandan General Population Cohort (GPC).

