Abstract

Background: Identifying quantitative trait loci (QTL) for both additive and epistatic effects raises the statistical issue of selecting variables from a large number of candidates using a small number of observations. Missing trait and/or marker values prevent one from directly applying classical model selection criteria such as Akaike's information criterion (AIC) and the Bayesian information criterion (BIC).

Results: We propose a two-step Bayesian variable selection method that deals with the sparse parameter space and the small sample size. The regression coefficient priors are flexible enough to incorporate the characteristics of "large p small n" data. Specifically, sparseness and possible asymmetry of the significant coefficients are dealt with by developing a Gibbs sampling algorithm to stochastically search through low-dimensional subspaces for significant variables. The superior performance of the approach is demonstrated via a simulation study. We also applied it to real QTL mapping datasets.

Conclusion: The two-step procedure coupled with Bayesian classification offers flexibility in modeling "large p small n" data, especially for the sparse and asymmetric parameter space. This approach can be extended to other settings characterized by high dimension and low sample size.

Highlights

  • Identifying quantitative trait loci (QTL) for both additive and epistatic effects raises the statistical issue of selecting variables from a large number of candidates using a small number of observations

  • This article extends the Bayesian framework in Zhang et al [28] to identify both additive and epistatic effects of QTL based on model (1)

  • The advantage of this approach mainly lies in the flexible priors for the regression coefficients, which account for some characteristics of "large p small n" data, the predictability of a model constructed with size n data, and the two-step strategy for dimension reduction


Summary

Introduction

Identifying quantitative trait loci (QTL) for both additive and epistatic effects raises the statistical issue of selecting variables from a large number of candidates using a small number of observations. With the advent of high-throughput biotechnologies for genotyping dense molecular markers throughout the genome, statistical methodologies are crucial for understanding the genetic architecture of complex traits and for locating the genes underlying important traits. Traditional approaches to QTL mapping test each locus on a dense grid along the chromosomes via the likelihood ratios of linear regression models (see the reviews by Doerge et al [2] and Broman and Speed [3]). Wang et al [4] proposed a Bayesian shrinkage estimation of QTL parameters that allows the shrinkage factor to vary across different effects. Even a moderate number of markers implies a large number of pairwise combinations, creating statistical issues in QTL mapping. Due to the small sample sizes and the lack of efficient statistical tools, the number ...
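To make the point about pairwise combinations concrete, the following minimal sketch counts the candidate effects for a given number of markers. It assumes one additive effect per marker and one epistatic effect per unordered marker pair; this is an illustrative simplification, not necessarily the parameterization of the article's model (1), which is not reproduced here. The function name and the marker counts are hypothetical.

```python
def count_candidate_effects(num_markers: int) -> dict:
    """Count candidate effects under an assumed parameterization:
    one additive effect per marker, one epistatic effect per marker pair."""
    if num_markers < 0:
        raise ValueError("num_markers must be non-negative")
    additive = num_markers
    # Number of unordered pairs of distinct markers: p * (p - 1) / 2
    epistatic = num_markers * (num_markers - 1) // 2
    return {
        "additive": additive,
        "epistatic": epistatic,
        "total": additive + epistatic,
    }


if __name__ == "__main__":
    # Hypothetical marker counts, chosen only for illustration.
    for p in (10, 100, 500):
        counts = count_candidate_effects(p)
        print(p, counts["additive"], counts["epistatic"], counts["total"])
```

Under these assumptions, 100 markers already yield 5050 candidate effects, which would typically exceed the number of individuals in a mapping population. This is the "large p small n" setting the text describes.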
