Abstract

Background: Advancements in statistical methods and sequencing technology have led to numerous novel discoveries in human genetics in the past two decades. Among phenotypes of interest, most attention has been given to studying genetic associations with continuous or binary traits. Efficient statistical methods have been proposed and are available for both types of traits under different study designs. However, for multinomial categorical traits in related samples, there is a lack of efficient statistical methods and software.

Results: We propose an efficient score test to analyze a multinomial trait in family samples, in the context of genome-wide association/sequencing studies. An alternative Wald statistic is also proposed. We also extend the methodology to ordinal traits. We performed extensive simulation studies to evaluate the type-I error of the score and Wald tests, compared with multinomial logistic regression for unrelated samples, under different allele frequencies and study designs. We also evaluated the power of these methods. Results show that both the score and Wald tests have a well-controlled type-I error rate, but multinomial logistic regression has an inflated type-I error rate when applied to family samples. We illustrate the score test with an application to the Framingham Heart Study to uncover genetic variants associated with diabesity, a multi-category phenotype.

Conclusion: Both proposed tests have a correct type-I error rate and similar power. However, because the Wald statistic relies on computationally intensive estimation, it is less efficient than the score test for large-scale genetic association studies. We provide a software implementation for both multinomial and ordinal traits.

Highlights

  • Advancements in statistical methods and sequencing technology have led to numerous novel discoveries in human genetics in the past two decades

  • We propose a computationally efficient score test based on extended generalized estimating equations (EGEE) for large-scale genetics studies of multi-category phenotypes accounting for familial correlation

  • We generated QQ-plots (Additional File 3) for the robust score test and the simplified score test across all minor allele frequency (MAF) scenarios, for both multinomial and ordinal traits, when applied to family-based samples


Summary

Introduction

Advancements in statistical methods and sequencing technology have led to numerous novel discoveries in human genetics in the past two decades. One previously proposed method used adaptive Gaussian quadrature to approximate the maximized log-likelihood, with a likelihood ratio test of the hypothesis of no association between a genetic variant and an ordinal trait of interest. This approach has not been widely used because computationally efficient software is lacking and the likelihood ratio test is itself computationally intensive. Bi and colleagues [5] proposed a computationally efficient framework (POLMM) for ordinal traits. Because POLMM does not allow for a user-provided kinship matrix, such as one estimated from pedigree data or by typical genetics software, it is limited for family-based cohort studies with known relationships. We evaluate our approach using simulations and apply it to a genome-wide scan to identify genetic variants associated with diabesity, a four-category phenotype, with the healthy referent category being no diabetes and no obesity and the unhealthiest category, “diabese” (diabetes and obesity), having a prevalence of at least 25% in several countries [6].
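
As an illustration of the four-category diabesity phenotype described above, the following is a minimal Python sketch of how such a trait might be coded from two binary indicators. It rests on assumptions: the text names only the referent category (no diabetes, no obesity) and the "diabese" category, so the two intermediate categories ("obesity only" and "diabetes only"), the integer codes, the label strings, and the function name `diabesity_category` are all hypothetical, and the toy records are made up.

```python
from collections import Counter

# Hypothetical labels: only the referent ("neither") and "diabese" categories
# are named in the text; the two middle categories are an assumption.
CATEGORIES = ("neither", "obesity_only", "diabetes_only", "diabese")


def diabesity_category(has_diabetes: bool, is_obese: bool) -> int:
    """Map two binary indicators to a category index (0 = referent)."""
    if has_diabetes and is_obese:
        return 3  # "diabese": diabetes and obesity
    if has_diabetes:
        return 2  # assumed intermediate category: diabetes only
    if is_obese:
        return 1  # assumed intermediate category: obesity only
    return 0      # referent: no diabetes and no obesity


# Toy, made-up individuals given as (has_diabetes, is_obese) pairs.
toy = [(False, False), (False, True), (True, False), (True, True), (False, False)]
codes = [diabesity_category(d, o) for d, o in toy]

# Tabulate how many individuals fall in each category.
counts = Counter(CATEGORIES[c] for c in codes)
```

Whether the resulting codes are treated as unordered (multinomial) or ordered (ordinal) is an analysis choice; the ordering of the two intermediate categories used here is arbitrary.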

