Abstract

In genetic epidemiology, studies of genotype-phenotype associations have contributed significantly to the genetics of complex human traits. These studies depend on specialized statistical methods for uncovering associations between traits and genetic variants, both common and rare. Analyses of such studies often need to account for potential confounders, such as social and environmental conditions. Multiple linear regression is the most widely used type of regression analysis when the outcome of interest is a quantitative trait. Many statistical tests for identifying genotype-phenotype associations using linear regression rely on the assumption that the trait (or the regression residuals) follows a normal distribution. In genomic research, the rank-based inverse normal transformation (INT) is one of the most popular approaches for obtaining normally distributed traits (or normally distributed residuals). Many researchers believe that applying the INT to non-normal traits (or non-normal residuals) is required for valid inference, because phenotypic (or residual) outliers and non-normality strongly influence both type I error rate control and statistical power, especially in rare-variant association testing. Here we propose a test of association between a rare variant and a quantitative trait based on a fully adjusted full-stage INT. Using simulations, we show that the fully adjusted full-stage INT is more appropriate than existing INT methods, such as the fully adjusted two-stage INT and the INT-based omnibus test, for testing genotype-phenotype associations with rare variants, especially when genotypes are uncorrelated with covariates. The fully adjusted full-stage INT retains the advantages of the fully adjusted two-stage INT and ameliorates its problems in the analysis of rare variants when the trait is non-normal. We also present theoretical results on these desirable properties. In addition, two available methods for non-normal traits, quantile (median) regression and the Yeo-Johnson power transformation, are included in the simulations for comparison with respect to these properties.
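To make the rank-based INT concrete, the following is a minimal sketch in Python. It assumes the common Blom-type form, z_i = Φ⁻¹((r_i − c)/(n − 2c + 1)) with offset c = 3/8, where r_i is the rank of observation i; the offset, the tie handling (average ranks), and the function names `average_ranks` and `rank_int` are illustrative assumptions and are not taken from the paper.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # mean of the 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


def rank_int(values, c=3 / 8):
    """Rank-based inverse normal transformation (offset c is an assumed default)."""
    n = len(values)
    inv_cdf = NormalDist().inv_cdf  # standard normal quantile function
    return [inv_cdf((r - c) / (n - 2 * c + 1)) for r in average_ranks(values)]


if __name__ == "__main__":
    skewed_trait = [0.2, 0.5, 0.9, 1.4, 3.0, 12.0, 85.0]
    print([round(z, 3) for z in rank_int(skewed_trait)])
```

The transformation depends only on the ranks, so an extreme value such as 85.0 above is mapped to the same normal quantile as any other largest observation, which is why the INT is robust to outliers in the trait or the residuals.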

Highlights

  • In recent years, there has been growing interest in using next-generation sequencing technologies to discover causal rare variants associated with complex human diseases and traits.

  • To describe the fully adjusted full-stage inverse normal transformation (INT) approach, we first present two existing methods: the fully adjusted two-stage INT procedure introduced by Sofer et al. [5] and the partly adjusted two-stage INT procedure that is widely used in genome-wide association studies.

  • When the INT-transformed residuals $RN(\hat{\varepsilon}_i)$, $i = 1, 2, \ldots, n$, in Stage 2 follow a normal distribution with zero mean and finite variance, only Stages 1–2 and Stage 5 of the fully adjusted full-stage INT method are used for testing the single nucleotide polymorphism (SNP) effect, which in turn means that the fully adjusted full-stage INT

Summary

Introduction

There has been growing interest in using next-generation sequencing technologies to discover causal rare variants associated with complex human diseases and traits. Sofer et al. [5] showed that such a partly adjusted two-stage INT results in these undesirable statistical properties because of a mis-specified mean-variance relationship for the genetic effect. It is necessary to investigate how transformations and covariate-variant relationships interact to affect estimated genetic effects, and to provide a comprehensive framework for genetic association analysis of rare variants with quantitative traits using INT-based procedures. In this investigation, we propose a test based on a fully adjusted full-stage INT approach for detecting the association of rare (and common) variants with a quantitative trait when the trait distribution departs from normality. Two available methods for non-normal traits, median regression and the Yeo-Johnson power transformation, are included in the simulations for comparison with respect to these desirable properties.
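The difference between the partly and fully adjusted two-stage INT procedures can be sketched as follows. This excerpt does not spell out the stages, so the sketch assumes the usual description: Stage 1 regresses the trait on covariates and rank-normalizes the residuals, and Stage 2 regresses the transformed residuals on the genotype, either with the covariates included again (fully adjusted) or without them (partly adjusted). The names `rank_int`, `ols`, and `two_stage_int` are hypothetical, and the code does not implement the full-stage method proposed in the paper.

```python
from statistics import NormalDist

import numpy as np


def rank_int(x, c=3 / 8):
    """Rank-based INT with an assumed Blom offset; assumes no ties in x."""
    x = np.asarray(x, dtype=float)
    ranks = np.empty(len(x))
    ranks[np.argsort(x, kind="stable")] = np.arange(1, len(x) + 1)
    probs = (ranks - c) / (len(x) - 2 * c + 1)
    inv_cdf = NormalDist().inv_cdf
    return np.array([inv_cdf(p) for p in probs])


def ols(y, X):
    """Ordinary least squares; returns coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta


def two_stage_int(y, covariates, genotype, fully_adjusted=True):
    """Return the estimated genotype effect on the INT-transformed residuals."""
    n = len(y)
    intercept = np.ones((n, 1))
    Z = np.column_stack([intercept, covariates])  # covariate design matrix
    # Stage 1: remove covariate effects, then rank-normalize the residuals.
    _, resid = ols(np.asarray(y, dtype=float), Z)
    y_int = rank_int(resid)
    # Stage 2: test the genotype, with or without re-adjusting for covariates.
    if fully_adjusted:
        X = np.column_stack([Z, genotype])
    else:
        X = np.column_stack([intercept, genotype])
    beta, _ = ols(y_int, X)
    return float(beta[-1])  # genotype is the last column


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 1000
    age = rng.normal(size=n)                          # simulated covariate
    g = rng.binomial(2, 0.05, size=n).astype(float)   # simulated rare variant
    y = 0.5 * age + 0.8 * g + rng.exponential(size=n) # skewed noise
    print(two_stage_int(y, age, g, fully_adjusted=True))
    print(two_stage_int(y, age, g, fully_adjusted=False))
```

The only difference between the two variants is the Stage 2 design matrix, which makes it easy to compare them on simulated data where the genotype is or is not correlated with the covariates.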
