Abstract

In whole-genome association studies, all markers are tested for association at the first stage, and their test statistics or p-values are ranked. At the second stage, some of the most significant markers are analyzed further with more powerful statistical methods. This reduces the number of hypotheses that must be corrected for in multiple testing. Ranks of true associations in genome-wide scans using a single test statistic have been studied. For a case-control association design, the trend test has been proposed. However, three different trend tests, optimal for the recessive, additive, and dominant models, respectively, are available for each marker. Because the true genetic model is unknown, we rank markers based on multiple test statistics or on test statistics that are robust to model mis-specification. We studied this problem with application to Problem 3 of Genetic Analysis Workshop 15. An independent simulation study was also conducted to further evaluate the proposed procedure.

Highlights

  • For a large genetic study, a two-stage analysis is often employed

  • Using the first simulated data set of Problem 3 from Genetic Analysis Workshop (GAW) 15, we study robust ranking when the underlying genetic model is unknown and examine whether robust test statistics would lead to robust rankings of about 10 K

  • Rather than ranking M SNPs based on any single Cochran-Armitage trend test (CATT), we propose ranking the SNPs by the MERT and the minimum of the p-values
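The ranking idea in the last highlight can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the usual CATT scores (0, 0, 1), (0, 1/2, 1), and (0, 1, 1) for the recessive, additive, and dominant models, takes MERT to be the maximin efficiency robust test formed from the recessive and dominant statistics, and estimates their null correlation from the pooled genotype frequencies. The function names and the genotype counts are hypothetical.

```python
import math

# Genotype scores (x0, x1, x2) for 0, 1, 2 copies of the risk allele.
SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}


def catt_z(cases, controls, scores):
    """Cochran-Armitage trend statistic Z for one SNP and one score vector."""
    R, S = sum(cases), sum(controls)
    N = R + S
    n = [r + s for r, s in zip(cases, controls)]
    t = sum(x * (S * r - R * s) for x, r, s in zip(scores, cases, controls))
    sum_x2n = sum(x * x * ni for x, ni in zip(scores, n))
    sum_xn = sum(x * ni for x, ni in zip(scores, n))
    var = (R * S / N) * (N * sum_x2n - sum_xn ** 2)
    # A non-positive variance means the scores do not vary; report no signal.
    return t / math.sqrt(var) if var > 0 else 0.0


def two_sided_p(z):
    """Two-sided p-value of a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def min_p(cases, controls):
    """Smallest of the three CATT p-values (a ranking score, not a valid p-value)."""
    return min(two_sided_p(catt_z(cases, controls, x)) for x in SCORES.values())


def mert_z(cases, controls):
    """MERT from the recessive and dominant CATTs (assumed form)."""
    N = sum(cases) + sum(controls)
    p0 = (cases[0] + controls[0]) / N
    p2 = (cases[2] + controls[2]) / N
    denom = (1.0 - p0) * (1.0 - p2)
    if denom <= 0:
        return 0.0
    # Null correlation between the recessive and dominant statistics.
    rho = math.sqrt(p0 * p2 / denom)
    z_rec = catt_z(cases, controls, SCORES["recessive"])
    z_dom = catt_z(cases, controls, SCORES["dominant"])
    return (z_rec + z_dom) / math.sqrt(2.0 * (1.0 + rho))


def rank_snps(table, score):
    """Return SNP names ordered from most to least significant."""
    return sorted(table, key=lambda name: score(*table[name]))


# Hypothetical genotype counts: (cases, controls), each as (n0, n1, n2).
snps = {
    "snp_null": ((25, 50, 25), (25, 50, 25)),
    "snp_b": ((20, 50, 30), (28, 50, 22)),
    "snp_a": ((10, 40, 50), (30, 50, 20)),
}

by_min_p = rank_snps(snps, min_p)
by_mert = rank_snps(snps, lambda c, s: two_sided_p(mert_z(c, s)))
```

The minimum p-value is used here only to order the SNPs. Treating it as a p-value would require a correction for having taken the best of three correlated tests.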


Summary

Introduction

For a large genetic study, a two-stage analysis is often employed. At the first stage, each marker is tested for association with a disease. Some of the most significant markers are then analyzed in the second stage. This two-stage analysis reduces the number of hypotheses to be tested in the second stage. It is important to know how many of the most significant markers one should study in the second stage so that the probability that one or several true markers will be studied in the second stage is greater than a given value. Conversely, when a given number of the most significant markers is selected, it is important to know the probability that this list contains one or more true markers. A small list of the most significant markers may not contain any true markers at all, which leads to spurious associations or negative findings in the second stage.
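The trade-off between list length and the chance of carrying a true marker into the second stage can be illustrated with a small Monte Carlo sketch. It is not the procedure used in the paper. It assumes independent markers, standard normal test statistics for null markers, a single true marker whose statistic is normal with a hypothetical mean `mu`, and ranking by absolute value. All names and parameter values are illustrative.

```python
import random


def simulate_true_ranks(num_null, mu, reps, seed=0):
    """Rank of the single true marker among all markers, once per replicate."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(reps):
        true_stat = abs(rng.gauss(mu, 1.0))
        # Rank 1 means the true marker is the most significant one.
        beaten_by = sum(
            1 for _ in range(num_null) if abs(rng.gauss(0.0, 1.0)) > true_stat
        )
        ranks.append(beaten_by + 1)
    return ranks


def prob_in_top(ranks, k):
    """Estimated probability that the true marker is among the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def smallest_list(ranks, target):
    """Smallest k whose estimated probability reaches the target, if any."""
    for k in range(1, max(ranks) + 1):
        if prob_in_top(ranks, k) >= target:
            return k
    return None


# Hypothetical setting: 1000 null markers and a moderate true effect.
ranks = simulate_true_ranks(num_null=1000, mu=3.0, reps=200, seed=1)
p_top10 = prob_in_top(ranks, 10)
p_top100 = prob_in_top(ranks, 100)
k_needed = smallest_list(ranks, 0.9)
```

Reading the two questions from the text against this sketch: `smallest_list` answers how many markers to carry forward for a given probability, and `prob_in_top` answers how likely a list of a given length is to contain the true marker.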

