Abstract

In this study, we proposed a hypothesis test method that is robust to different genotype encodings in genome-wide association analyses of continuous traits. When population stratification is corrected for using a method based on principal component analysis, ordinally (or categorically) encoded genotypes are adjusted and become continuous values. Because of this adjustment, the association test result obtained with conventional methods, such as the test of Pearson’s correlation coefficient, was shown to depend on how the genotypes were encoded. To overcome this shortcoming, we proposed a non-parametric test based on Kendall’s tau. Because Kendall’s tau measures associations between the ranks, rather than the values, of adjusted genotypes and phenotypes, Kendall’s test can be more robust than Pearson’s test under different genotype encodings. We assessed the robustness of Kendall’s test and compared it with that of Pearson’s test in terms of the difference in p-values obtained using different genotype encodings. Using simulated as well as real data sets, we demonstrated that Kendall’s test was more robust than Pearson’s test under different genotype encodings. The proposed method is applicable to a broad range of topics in population genetics and comparative genomics in which novel genetic variants are associated with traits. This study may also encourage a more cautious approach to genotype encoding in numerical analysis.
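
The encoding dependence described above can be illustrated with a minimal sketch. It is not the authors' implementation: the simulation parameters, the two encodings `(0, 1, 2)` and `(0, 1, 4)`, and the helper names are all assumptions made for illustration, and a single simulated ancestry covariate stands in for the top principal component. The sketch regresses that covariate out of both genotype and phenotype, then reports Pearson's r and Kendall's tau-b before and after the adjustment so the two encodings can be compared.

```python
import math
import random


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kendall_tau_b(x, y):
    """Kendall's tau-b (tie-corrected), computed over all pairs in O(n^2)."""
    n = len(x)
    conc = disc = ties_x = ties_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0:
                ties_x += 1
            if dy == 0:
                ties_y += 1
            if dx * dy > 0:
                conc += 1
            elif dx * dy < 0:
                disc += 1
    n0 = n * (n - 1) // 2
    return (conc - disc) / math.sqrt((n0 - ties_x) * (n0 - ties_y))


def residualize(v, c):
    """Remove the least-squares linear effect of covariate c from v."""
    n = len(v)
    mv, mc = sum(v) / n, sum(c) / n
    slope = (sum((ci - mc) * (vi - mv) for ci, vi in zip(c, v))
             / sum((ci - mc) ** 2 for ci in c))
    return [vi - mv - slope * (ci - mc) for vi, ci in zip(v, c)]


def compare_encodings(n=200, seed=0):
    """Simulate one SNP and compare statistics under two genotype encodings."""
    rng = random.Random(seed)
    # Assumption: one continuous ancestry score plays the role of the top PC.
    ancestry = [rng.random() for _ in range(n)]
    # Minor-allele counts (0, 1, or 2) whose frequency depends on ancestry.
    counts = [sum(rng.random() < 0.2 + 0.5 * a for _ in range(2))
              for a in ancestry]
    # Continuous trait influenced by both genotype and ancestry.
    phenotype = [0.3 * g + a + rng.gauss(0, 1)
                 for g, a in zip(counts, ancestry)]
    pheno_adj = residualize(phenotype, ancestry)

    # Hypothetical encodings: additive versus another monotone coding.
    encodings = {"additive": (0, 1, 2), "alternative": (0, 1, 4)}
    out = {}
    for name, codes in encodings.items():
        raw = [codes[g] for g in counts]
        adj = residualize(raw, ancestry)  # continuous after adjustment
        out[name] = {
            "r_raw": pearson_r(raw, phenotype),
            "tau_raw": kendall_tau_b(raw, phenotype),
            "r_adj": pearson_r(adj, pheno_adj),
            "tau_adj": kendall_tau_b(adj, pheno_adj),
            "n_levels_adj": len(set(adj)),
        }
    return out


results = compare_encodings()
for name, stats in results.items():
    print(name, stats)
```

Before adjustment, tau-b is identical under any order-preserving encoding because it depends only on ranks and ties, whereas Pearson's r changes with the numeric codes. After adjustment the genotype takes many distinct values, so neither statistic is guaranteed to be invariant; comparing `r_adj` and `tau_adj` across the two encodings shows how large the remaining gap is for a given simulated data set.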

Highlights

  • A genome-wide association study (GWAS) includes the statistical test of associations between genetic variants, such as SNPs, and phenotypes of interest across samples

  • Assuming that a principal component analysis (PCA)-based method is used to correct for population structure, we addressed the robustness of association test methods under different genotype encoding schemes

  • A “good” association test method should produce test results that are robust to different genotype encodings


Summary

Introduction

A genome-wide association study (GWAS) includes the statistical test of associations between genetic variants, such as SNPs, and phenotypes (or traits) of interest across samples. In the 10 years since GWAS were first introduced, about 10,000 robust associations with diseases, disorders, and other genomic traits have been discovered [1]. Detecting associations between genetic variants and traits depends on many factors, such as the sample size, the frequency of the genetic variants, and the linkage disequilibrium between the observed and unknown causal variants. Population stratification (or population structure) should be taken into account to avoid spurious associations caused by the genetic differences in samples from. A consistent approach to the genotype encoding problem

