Abstract

With the availability of high-density genomic data containing millions of single nucleotide polymorphisms and tens or hundreds of thousands of individuals, genetic association studies are likely to identify the variants contributing to complex traits on a genome-wide scale. However, genome-wide association studies are confounded by spurious associations when sample structure (population structure, family structure and cryptic relatedness) is not properly accounted for. The absence of a complete population genealogy in the genome-wide association study model has strongly motivated the development of new methods to correct the inflation of false positives. Among these, linear mixed model based approaches, which have the advantage of capturing multilevel relatedness, have gained considerable ground. We summarize the current literature dealing with sample structure, and our review focuses on the following four areas: (i) the approaches handling population structure in genome-wide association studies; (ii) the linear mixed model based approaches in genome-wide association studies; (iii) the performance of linear mixed model based approaches in genome-wide association studies; and (iv) the unsolved issues and future work of linear mixed model based approaches.

Highlights

  • Recent breakthroughs in genotyping technology have enabled high-density genome-wide data collection, giving researchers fast and cost-efficient access to an extraordinarily large number of single nucleotide polymorphisms (SNPs), including newly identified markers

  • It is well known that genome-wide association studies (GWAS) can suffer from inflated false positive rates if population structure, which arises when a single study includes individuals from different populations, is not properly corrected for in the model [6, 7]

  • Many new approaches have been developed based on linear mixed models (LMM), a strategy that makes it possible to account for sample structure in GWAS


Summary

INTRODUCTION

Recent breakthroughs in genotyping technology have enabled high-density genome-wide data collection, giving researchers fast and cost-efficient access to an extraordinarily large number of single nucleotide polymorphisms (SNPs), including newly identified markers. Many new approaches have been developed based on linear mixed models (LMM), a strategy that makes it possible to account for sample structure in GWAS. The LMM strategy estimates the genetic similarity between each pair of individuals to account for the genealogy of the population. The methods reviewed include once-dominant approaches that account for only part of the sample structure and newer approaches that use linear mixed models to capture the genealogy of the population in GWAS. The unsolved questions and future work of the new LMM-based approaches are also discussed.

The Open Bioinformatics Journal, 2013, Volume 7
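
To make the idea of estimating genetic similarity between pairs of individuals concrete, the following is a minimal sketch of one common way such a similarity (kinship) matrix can be computed from SNP genotypes. It is an illustration under stated assumptions, not the estimator used by any particular method in the review: genotypes are assumed to be coded as 0/1/2 allele counts with no missing values, each SNP is standardized by its sample allele frequency, and the function name `genetic_relationship_matrix` and the toy data are hypothetical.

```python
import numpy as np

def genetic_relationship_matrix(genotypes):
    """Estimate pairwise genetic similarity from an (individuals x SNPs)
    matrix of allele counts coded 0/1/2.

    Assumed estimator (for illustration only): K = Z Z^T / m, where Z holds
    the per-SNP standardized genotypes and m is the number of SNPs kept.
    """
    G = np.asarray(genotypes, dtype=float)
    p = G.mean(axis=0) / 2.0              # sample allele frequency per SNP
    keep = (p > 0) & (p < 1)              # drop SNPs fixed at 0 or 2 copies
    G, p = G[:, keep], p[keep]
    Z = (G - 2 * p) / np.sqrt(2 * p * (1 - p))   # center and scale each SNP
    return Z @ Z.T / Z.shape[1]

# Toy data: 5 individuals, 200 randomly generated SNPs (not real genotypes).
rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(5, 200))
K = genetic_relationship_matrix(toy_genotypes)
```

In an LMM-based analysis, a matrix like `K` would typically serve as the covariance structure of a random genetic effect, so that related individuals are allowed correlated phenotypes rather than being treated as independent.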

THE APPROACHES HANDLING POPULATION STRUCTURE IN GWAS
The Linear Mixed Model
The Development of LMM-Based Approaches
THE PERFORMANCE OF LMM-BASED APPROACHES IN GWAS
Method
THE UNSOLVED ISSUES AND FUTURE WORK OF LMM-BASED APPROACHES
DISCUSSIONS
