Abstract

The statistical methods commonly used in medical research generally assume that patients arise from one homogeneous population. However, the existence and importance of significant heterogeneity have been widely documented. It is well known that common and complex human diseases usually have heterogeneous etiology, which often involves the interplay of multiple genetic and environmental factors and leads to latent population substructure. Genome-wide association studies (GWAS) are a useful tool for uncovering genetic associations with a disease of interest, while linkage analysis is a commonly used method for identifying statistical association between the inheritance of a human disease and the inheritance of marker loci that are in linkage with disease-causing loci. We propose a likelihood ratio test (LRT) for genome-wide linkage analysis under genetic heterogeneity using family data. We derive a closed-form formula for the LRT statistic and provide its explicit asymptotic null distribution. The closed-form asymptotic distribution allows easy determination of asymptotic p-values. Our extensive simulation studies indicate that the proposed test has proper type I error control and good power under genetic heterogeneity. To simplify application of the proposed method for non-statisticians, we develop an R package, gLRTH, that implements the proposed LRT for genome-wide linkage analysis as well as Qian and Shao’s LRT for GWAS under heterogeneity. The newly developed open-source R package gLRTH is available on CRAN.

Highlights

  • Motivated by Qian and Shao’s [15] LRT-H for genome-wide association studies (GWAS), in this paper we propose a powerful and computationally efficient likelihood ratio test under genetic heterogeneity for linkage analysis based on a binomial mixture model, using family data with parental marker genotypes and genotypes of two affected siblings

  • The required arguments are: 1) n0: the number of affected sibling pairs in which both siblings inherited A from their heterozygous parent Aa; 2) n1: the number of affected sibling pairs in which one sibling inherited A and the other inherited a from their heterozygous parent Aa; 3) n2: the number of affected sibling pairs in which both siblings inherited a from their heterozygous parent Aa. To illustrate the gLRTH_L function, suppose we have hypothetical genetic marker M1/M2 information from a sample of n = 1000 independent families, with M2 being the marker of interest

  • The commonly used statistical methods in medical research often assume patients arise from one homogeneous population
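As a rough illustration of the gLRTH_L arguments described above, the following minimal R sketch shows how such a call might look. The function name and the argument names n0, n1 and n2 come from the text; the counts are invented purely for illustration and are not the paper's worked example, and the structure of the returned object is not assumed here, so it is simply printed.

```r
# Minimal usage sketch; assumes the gLRTH package has been installed from CRAN,
# e.g. with install.packages("gLRTH").
library(gLRTH)

# Hypothetical counts of affected sibling pairs (invented for illustration only).
n0 <- 280  # both siblings inherited A from the heterozygous parent Aa
n1 <- 470  # one sibling inherited A, the other inherited a
n2 <- 250  # both siblings inherited a

# Run the likelihood ratio test for linkage under genetic heterogeneity.
result <- gLRTH_L(n0 = n0, n1 = n1, n2 = n2)

# The exact contents of the result are not assumed; inspect what is returned.
print(result)
```

The package documentation on CRAN is the place to confirm how the output should be interpreted.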


Summary

Introduction

The currently available genetic linkage methods that account for latent genetic heterogeneity are based on mixture models and are generally computationally expensive for genome-wide or NGS data [13] [23] [24] [25], yet ignoring heterogeneity can reduce the efficiency of statistical tests, increasing the number of false negative findings or missed opportunities. Motivated by Qian and Shao’s [15] LRT-H for GWAS, in this paper we propose a powerful and computationally efficient likelihood ratio test under genetic heterogeneity for linkage analysis based on a binomial mixture model, using family data with parental marker genotypes and genotypes of two affected siblings.

Methods
Mixture Binomial and Maximum Likelihood
The Likelihood Ratio Test
Type I Errors
Power Comparison
The R Package Description and Examples
Conclusions
