Abstract

We apply a mixed-model analysis to the Genetic Analysis Workshop 15, Problem 3 simulated data. Such models are commonly used to mitigate the tendency of population structure, or cryptic relatedness, to inflate the false-positive rate of test statistics. They also allow explicit modeling of varying degrees of relatedness in samples in which some individuals are related by a (possibly unknown) pedigree, whereas others are not. Furthermore, the implementation of the method we describe here is fast enough to be used effectively on genome-wide data. Our analysis shows that these methods can effectively find signals in these data. Somewhat disappointingly, the false-positive rate does not appear to be reduced, but this is largely because the method used to simulate the data appears not to have included effects, such as population stratification, that might have led to inflation of the test statistics.

Highlights

  • A major issue when analyzing genome-wide data is that of false-positive signals. In part, this is caused by the large number of loci that are typically analyzed in such studies, but it is also often caused by the effects of population stratification [1,2] or cryptic relatedness [3].

  • Yu et al. [12] introduced a mixed-model approach suitable for application in a genome-wide context, in which relatedness was estimated from genome-wide marker data. We extended this approach and applied it to data for Arabidopsis thaliana, in which the correlation between phenotypic distribution and population structure is high, and saw that the mixed-model approach greatly reduced the effects of population structure on false-positive rates [8].

  • For each replicate, a combined SNP data set comprising both the 1500 affected sib pair (ASP) families and the 2000 unrelated control subjects is analyzed.
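The marker-based relatedness estimate mentioned in the highlights above can be illustrated with a minimal sketch. It assumes genotypes coded as 0/1/2 minor-allele counts and uses a common standardized cross-product estimator; this is an illustrative choice and not necessarily the estimator used by Yu et al. [12] or in the analysis described here. The function name `relatedness_matrix` is hypothetical.

```python
from statistics import mean


def relatedness_matrix(genotypes):
    """Estimate pairwise relatedness from SNP genotypes (illustrative sketch).

    genotypes: list of individuals, each a list of 0/1/2 allele counts,
    one entry per SNP. Returns an n x n list-of-lists matrix K.
    Assumption: a standardized cross-product estimator, K = Z Z^T / m.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Sample allele frequency at each SNP.
    freqs = [mean(ind[j] for ind in genotypes) / 2.0 for j in range(m)]
    # Monomorphic SNPs carry no information and would divide by zero.
    used = [j for j in range(m) if 0.0 < freqs[j] < 1.0]
    if not used:
        raise ValueError("no polymorphic SNPs available")
    # Center by 2p and scale by the binomial standard deviation sqrt(2p(1-p)).
    z = [
        [(ind[j] - 2.0 * freqs[j]) / (2.0 * freqs[j] * (1.0 - freqs[j])) ** 0.5
         for j in used]
        for ind in genotypes
    ]
    # Average product of standardized genotypes over the SNPs that were used.
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / len(used) for k in range(n)]
        for i in range(n)
    ]
```

In a mixed model, a matrix of this kind would typically enter as the covariance structure of a random polygenic effect, so that related individuals are allowed to have correlated phenotypes.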

Summary

Introduction

A major issue when analyzing genome-wide data is that of false-positive signals. In part, this is caused by the large number of loci that are typically analyzed in such studies, but it is also often caused by the effects of population stratification [1,2] or cryptic relatedness [3]. We apply mixed-model methods that have been developed to reduce the adverse effects of population structure, whether caused by geographical structure of populations or by relatedness (either observed or unobserved) between individuals. As an example of such effects, Campbell et al. [5] reported that a SNP in the gene LCT, which is unrelated to height, showed strong association with height in a study of a European American population.
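One common way to check whether population structure or cryptic relatedness has inflated a set of association results is the genomic-control inflation factor. The sketch below is an illustration only: the source does not state that this diagnostic was used, and the function name `inflation_factor` is hypothetical. It assumes two-sided p-values from 1-degree-of-freedom tests, each strictly greater than 0.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(p_values):
    """Median observed 1-df chi-square statistic divided by its null median.

    Assumption: p_values are two-sided p-values in (0, 1] from 1-df tests.
    Values near 1 suggest little inflation; values well above 1 suggest
    inflated test statistics.
    """
    std_normal = NormalDist()
    # Convert each p-value back to a 1-df chi-square statistic:
    # the squared normal quantile at p/2 (lower tail, by symmetry).
    chi2 = [std_normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

For p-values that are uniformly distributed, as expected under the null hypothesis with no stratification, the factor should be close to 1.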

